[Qt-interest] Accented characters and QDom in Windows
Andreas Pakulat
apaku at gmx.de
Thu Aug 20 21:02:36 CEST 2009
On 20.08.09 12:11:13, Ellen Kestrel wrote:
> Ah, good - how do I indicate encodings in the XML - is it specified with the
> document type?
The leading <?xml ... ?> declaration can contain an encoding
attribute naming the encoding. The XML spec says that if no encoding
is given (and there is no byte order mark), UTF-8 is assumed.
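To make that rule concrete, here is a rough sketch in plain C++ of how a reader could pick the encoding from the declaration. The function name declaredEncoding is made up for illustration; it is not part of Qt or of any XML library, and it ignores byte order marks and non-ASCII-compatible encodings.

```cpp
#include <string>

// Hypothetical helper: returns the encoding named in a leading XML
// declaration, or "UTF-8" when there is no declaration or no encoding
// attribute in it. Naive sketch: assumes ASCII-compatible input and
// does not look at byte order marks.
std::string declaredEncoding(const std::string& xml) {
    const std::string fallback = "UTF-8";
    if (xml.compare(0, 5, "<?xml") != 0) return fallback;

    const std::string::size_type end = xml.find("?>");
    if (end == std::string::npos) return fallback;
    const std::string decl = xml.substr(0, end);

    std::string::size_type pos = decl.find("encoding");
    if (pos == std::string::npos) return fallback;

    // The value is quoted with either " or '.
    pos = decl.find_first_of("\"'", pos);
    if (pos == std::string::npos) return fallback;
    const char quote = decl[pos];
    const std::string::size_type close = decl.find(quote, pos + 1);
    if (close == std::string::npos) return fallback;

    return decl.substr(pos + 1, close - pos - 1);
}
```

A file that starts with <?xml version="1.0" encoding="ISO-8859-1"?> yields "ISO-8859-1" here, while a file with no declaration at all falls back to "UTF-8".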
> I have just been putting strings meant for humans to read
> there, as the examples use things like "mydocument" and "MyML". This is the
> first time I've really played with XML, actually.
That's not the problem; the problem arises when you write your DOM out to
disk. At that point you should either leave the writing to QDom* or make
100% sure you're using the right encoding when writing. So let's look at
your code...
> Writing XML:
> QFile file (filename + ".cdic");
> QTextStream out (&file);
>
> if (!file.open (QFile::WriteOnly | QFile::Text)) {
>     QMessageBox::warning (this, "", "Cannot write file.");
>     return;
> }
>
> out << doc.toString ();
>
> file.close ();
>
> Or is it to do with the QTextStream?
Yep. If you look at the QTextStream API docs, in particular those for
the operator<< overload you use (the one taking a QString):
http://doc.trolltech.com/4.5/qtextstream.html#operator-lt-lt-4
You'll notice that it mentions an "assigned codec" and a "default codec". This
is where the problem comes from: the default encoding is determined by
your locale. That is usually UTF-8 on modern Linux systems, but
something similar to Latin-1 on Windows in Europe. However, QDomDocument
doesn't know this, so it doesn't create an encoding attribute.
Now if an XML parser is told to parse the file (as you also do in
the reading code), it sees no encoding attribute and thus expects UTF-8
encoded data. But your file is not UTF-8 encoded.
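To see why the parser chokes, here is a small sketch in plain C++ (no Qt). Both function names are made up for illustration. The check is structural only: it does not reject overlong forms or surrogates, and it treats the Windows encoding as plain Latin-1, which is an approximation.

```cpp
#include <string>

// Hypothetical helper: true if the bytes have the shape of UTF-8, i.e.
// every lead byte is followed by the right number of continuation bytes.
bool looksLikeUtf8(const std::string& bytes) {
    std::string::size_type i = 0;
    while (i < bytes.size()) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        std::string::size_type extra;
        if (c < 0x80) extra = 0;                 // ASCII
        else if ((c & 0xE0) == 0xC0) extra = 1;  // 2-byte sequence
        else if ((c & 0xF0) == 0xE0) extra = 2;  // 3-byte sequence
        else if ((c & 0xF8) == 0xF0) extra = 3;  // 4-byte sequence
        else return false;                       // stray or invalid lead byte

        if (extra > bytes.size() - i - 1) return false;  // truncated sequence
        for (std::string::size_type k = 1; k <= extra; ++k) {
            const unsigned char next = static_cast<unsigned char>(bytes[i + k]);
            if ((next & 0xC0) != 0x80) return false;     // not a continuation
        }
        i += extra + 1;
    }
    return true;
}

// Hypothetical helper: re-encodes Latin-1 bytes as UTF-8.
std::string latin1ToUtf8(const std::string& latin1) {
    std::string out;
    for (char ch : latin1) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

In Latin-1, "café" ends in the single byte 0xE9, which a UTF-8 parser reads as the start of a three-byte sequence that never arrives. In UTF-8 the same character is the two bytes 0xC3 0xA9.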
Long story short, don't use the toString() function on QDomDocument;
use toByteArray() instead, bypass the QTextStream completely, and write
the QByteArray directly to the file using QFile::write(). Then the file
will have UTF-8 encoded XML in it. (Pushing the QByteArray through the
QTextStream is not a safe alternative, as the stream can convert the
bytes through its codec again.)
Andreas
--
Ships are safe in harbor, but they were never meant to stay there.