[Qt-interest] QString::toLower() issues with two-byte chars

Thiago Macieira thiago at kde.org
Fri Mar 12 16:11:25 CET 2010


On Friday 12 March 2010, at 03:50:28, Øyvind Vågen Jægtnes wrote:
> Thank you, they both worked like a charm!
> I forget that not everyone is sitting at a UTF-8 enabled terminal ;)
> 
> But this brings up another issue that might come down the road. What
> happens if someone runs this program in a latin1 terminal and inputs
> the same types of chars? Is there a way to detect the encoding of the
> terminal one is running at?

Your issue has nothing to do with the terminal/locale encoding. It's the 
source file encoding. Qt has separate settings for each of those two.

The locale encoding is QTextCodec::codecForLocale and is automatically 
detected. Whenever you use QString::fromLocal8Bit and QString::toLocal8Bit, 
you're going through the locale mapper. Whenever you read data from "the 
terminal", like you said, you have to go through this mapper too, to obtain 
proper Unicode strings.
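
For reference, on POSIX systems the locale's character set can be queried 
roughly as below. This is only a sketch of the usual mechanism, under the 
assumption of a POSIX platform; it is not necessarily what Qt does internally, 
and localeCodesetName is a made-up helper name, not a Qt function.

```cpp
#include <cassert>
#include <clocale>
#include <string>
#include <langinfo.h>  // POSIX only; not available on Windows

// Hypothetical helper: returns the character-set name of the user's locale,
// for example "UTF-8" or "ISO-8859-1", depending on the environment.
std::string localeCodesetName()
{
    // Adopt the locale from the environment (LC_ALL, LC_CTYPE, LANG).
    // If this fails, the process simply stays in the default "C" locale.
    std::setlocale(LC_ALL, "");

    // Ask the C library which codeset the current locale uses.
    return nl_langinfo(CODESET);
}
```

In a Qt program you should not need this, since QString::fromLocal8Bit 
already goes through the detected locale codec.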

The source file encoding is QTextCodec::codecForCStrings and is not 
automatically detected. It defaults to Latin 1. QString::fromAscii, 
QString::toAscii and all the QString conversions to and from const char* and 
QByteArray use this encoding. And that was your problem: your source code was 
UTF-8, but QString interpreted your byte array as Latin 1. So instead of 
reading "æøå", the string was actually "Ã¦Ã¸Ã¥".
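
To make the misreading concrete, here is a minimal sketch in plain C++ with no 
Qt dependency. It treats every input byte as one Latin 1 character, which is 
the effect of a Latin 1 default on UTF-8 source bytes, and re-encodes the 
result as UTF-8 so it can be inspected. latin1ToUtf8 is a hypothetical helper 
name used only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: decode a byte string as Latin 1 (each byte is one
// code point in U+0000..U+00FF) and re-encode the result as UTF-8.
std::string latin1ToUtf8(const std::string &bytes)
{
    std::string out;
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            // ASCII range: same single byte in Latin 1 and UTF-8.
            out += static_cast<char>(c);
        } else {
            // U+0080..U+00FF: two bytes in UTF-8.
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Feeding it the six UTF-8 bytes of "æøå" ("\xC3\xA6\xC3\xB8\xC3\xA5") gives 
six characters, "Ã¦Ã¸Ã¥", instead of three. With Qt, the usual ways around 
this are to call QString::fromUtf8 on the literal or to set the C-string codec 
to UTF-8 with QTextCodec::setCodecForCStrings.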

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
  Senior Product Manager - Nokia, Qt Development Frameworks
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358