[Qt-interest] Qt4.5.1: QTextStream input with 16bit characters

Constantin Makshin cmakshin at gmail.com
Mon Aug 16 11:39:09 CEST 2010


If you look at the QTextStream sources (e.g., at http://qt.gitorious.org/qt/qt/blobs/4.7/src/corelib/io/qtextstream.cpp), you'll find that, for some reason, it does not use the QTextCodec::IgnoreHeader conversion flag (the one that tells the codec to ignore the BOM) when reading.

The description of the QTextStream::setAutoDetectUnicode() function says: "It is common to set the codec to UTF-8, and then enable UTF-16 detection". So this may help you:

stream.setCodec("UTF-8");
stream.setAutoDetectUnicode(true);
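
To make it clearer what BOM-based detection amounts to, here is a minimal plain C++ sketch (no Qt involved) that looks at the first two bytes, picks the byte order, and drops the BOM before decoding. The helper name decodeUtf16WithBom is hypothetical, and falling back to little-endian when no BOM is found is my own assumption. This is not how QTextStream is implemented, just an illustration of the idea:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of Qt): detect a UTF-16 BOM, choose the
// byte order accordingly, and decode the remaining bytes into 16-bit code
// units. The BOM itself is not copied into the result.
// Assumption: input without a BOM is treated as little-endian.
std::u16string decodeUtf16WithBom(const std::vector<std::uint8_t>& bytes)
{
    bool littleEndian = true;
    std::size_t pos = 0;

    if (bytes.size() >= 2) {
        if (bytes[0] == 0xFF && bytes[1] == 0xFE) {        // UTF-16LE BOM
            littleEndian = true;
            pos = 2;
        } else if (bytes[0] == 0xFE && bytes[1] == 0xFF) { // UTF-16BE BOM
            littleEndian = false;
            pos = 2;
        }
    }

    std::u16string out;
    // Combine each byte pair into one code unit; a trailing odd byte is ignored.
    for (; pos + 1 < bytes.size(); pos += 2) {
        const std::uint16_t lo = littleEndian ? bytes[pos] : bytes[pos + 1];
        const std::uint16_t hi = littleEndian ? bytes[pos + 1] : bytes[pos];
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;
}
```

Fed the first line of the hex dump quoted below (FF FE 22 00 31 00 ...), this should give the text "1" 7 followed by CR LF, without a leading U+FEFF character.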

On Monday 09 August 2010 19:25:10 Rainer Wiesenfarth wrote:
> 
> I have a problem with Qt4.5.1 and text files containing 16bit characters. A
> hex representation of the file's content looks like this:
> 
>   FFFE 2200 3100 2200 2000 3700 0D00 0A00
>   2200 3200 2200 2000 3800 0D00 0A00
> 
> The first character of the file is the UTF Byte Order Mark. If I try to read
> this file using the code below, I get the two lines (indented for
> readability):
> 
>   "1" 7
>   "2" 8
> 
> Here is the code snippet:
> 
>   QFile file (filename);
> 
>   if (file.open (QIODevice::ReadOnly | QIODevice::Text))
>   {
>     QTextStream stream (&file);
>     QString     line;
> 
>     while (! stream.atEnd ())
>     {
>       line = stream.readLine ();
> 
>       if (line.startsWith (QChar::ByteOrderMark))
>       {
>         // This line is not reached
>         line = line.mid (1);
>       }
>     }
>   }
> 
> I would have assumed that QTextStream strips the Byte Order Mark character
> (as it picks the appropriate QTextCodec for the file), but it does not. Even
> worse, removing the Byte Order Mark with QString::startsWith() fails also,
> as the character got converted.
> 
> My question: Where is my mistake? Or is this a (probably already solved)
> "feature" of Qt4.5.1? Any guesses how to solve this?
> 
> Best Regards / Mit freundlichen Grüßen
> Rainer Wiesenfarth


