[Qt-interest] Qt4.5.1: QTextStream input with 16bit characters
Constantin Makshin
cmakshin at gmail.com
Mon Aug 16 11:39:09 CEST 2010
If you look at the QTextStream sources (e.g., at http://qt.gitorious.org/qt/qt/blobs/4.7/src/corelib/io/qtextstream.cpp), you'll find that, for some reason, it does not use the QTextCodec::IgnoreHeader conversion flag (the one that tells the codec to ignore the BOM) when reading.
The description of the QTextStream::setAutoDetectUnicode() function says "It is common to set the codec to UTF-8, and then enable UTF-16 detection". So this may help you:
stream.setCodec("UTF-8");
stream.setAutoDetectUnicode(true);
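For reference, the hex dump in the quoted message starts with FF FE, which is the UTF-16 little-endian BOM. Below is a minimal, Qt-free sketch of what BOM-based detection amounts to. decodeUtf16 is a hypothetical helper written only for this illustration; it is not a Qt function and does not claim to reflect how QTextStream is implemented. Treating input without a BOM as little-endian is an assumption of the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of Qt): decode a byte buffer as UTF-16.
// A leading BOM, if present, selects the byte order and is then dropped.
// Assumption of this sketch: without a BOM the data is little-endian.
std::u16string decodeUtf16(const std::vector<std::uint8_t>& bytes)
{
    bool littleEndian = true;
    std::size_t pos = 0;

    if (bytes.size() >= 2) {
        if (bytes[0] == 0xFF && bytes[1] == 0xFE) {
            littleEndian = true;   // FF FE: UTF-16LE BOM
            pos = 2;               // skip the BOM
        } else if (bytes[0] == 0xFE && bytes[1] == 0xFF) {
            littleEndian = false;  // FE FF: UTF-16BE BOM
            pos = 2;               // skip the BOM
        }
    }

    std::u16string out;
    // Combine each pair of bytes into one 16-bit code unit;
    // a trailing odd byte, if any, is ignored.
    for (; pos + 1 < bytes.size(); pos += 2) {
        const std::uint16_t lo = bytes[pos + (littleEndian ? 0 : 1)];
        const std::uint16_t hi = bytes[pos + (littleEndian ? 1 : 0)];
        out.push_back(static_cast<char16_t>((hi << 8) | lo));
    }
    return out;
}
```

Feeding it the first line of the dump (FF FE 22 00 31 00 22 00 20 00 37 00 0D 00 0A 00) yields the seven code units of "1" 7 followed by CR LF, with no leading U+FEFF.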
On Monday 09 August 2010 19:25:10 Rainer Wiesenfarth wrote:
>
> I have a problem with Qt4.5.1 and text files containing 16bit characters. A
> hex representation of the file's content looks like this:
>
> FFFE 2200 3100 2200 2000 3700 0D00 0A00
> 2200 3200 2200 2000 3800 0D00 0A00
>
> The first character of the file is the UTF Byte Order Mark. If I try to read
> this file using the code below, I get the two lines (indented for
> readability):
>
> "1" 7
> "2" 8
>
> Here is the code snippet:
>
> QFile file (filename);
>
> if (file.open (QIODevice::ReadOnly | QIODevice::Text))
> {
>     QTextStream stream (&file);
>     QString line;
>
>     while (! stream.atEnd ())
>     {
>         line = stream.readLine ();
>
>         if (line.startsWith (QChar::ByteOrderMark))
>         {
>             // This line is not reached
>             line = line.mid (1);
>         }
>     }
> }
>
> I would have assumed that QTextStream strips the Byte Order Mark character
> (as it picks the appropriate QTextCodec for the file), but it does not. Even
> worse, removing the Byte Order Mark with QString::startsWith() fails also,
> as the character got converted.
>
> My question: Where is my mistake? Or is this a (probably already solved)
> "feature" of Qt4.5.1? Any guesses how to solve this?
>
> Best Regards / Mit freundlichen Grüßen
> Rainer Wiesenfarth