[Qt-interest] Have QXmlStreamWriter write the encoding information

Andreas Pakulat apaku at gmx.de
Fri Jun 5 20:18:43 CEST 2009


On 05.06.09 00:19:36, Lingfa Yang wrote:
> Andreas Pakulat wrote:
> > On 04.06.09 01:19:57, Aleksandar Lazic wrote:
> >   
> >> On Mit 03.06.2009 23:08, Andreas Pakulat wrote:
> >>     
> >>> In fact, Qt's XmlStreamReader won't read the result back in, see this
> >>> example:
> >>>
> >>> ,----[ main.cpp ]-
> >>> | #include <QtCore>
> >>> | #include <QDebug>
> >>> | #include <QtXml>
> >>> | 
> >>> | int main(int argc, char** argv)
> >>> | {
> >>> |     QString tmp;
> >>> |     QXmlStreamWriter w(&tmp);
> >>>       
> >> Please can you try this:
> >>     
> >
> > I know that works, I was proving to Lingfa that adding a PI that includes
> > the encoding will still get you a broken XML if you use QString as output
> > for the writer.
> >
> > Andreas
> >
> >   
> No, it's not true. No broken XML when I use either QFile, QString, or 

Let's keep to QString because that's the one that is different.

>   {
>     QString s;
> 
>     if (file.open(QIODevice::WriteOnly | QIODevice::Text)) {
>       QXmlStreamWriter w(&s);  // 2
>       w.writeProcessingInstruction("xml", "version=\"1.0\" encoding=\"UTF-8\"");

Well, that works, but if you look at my example you'll notice that I've
been keeping close to the examples, which use writeStartDocument.
 
> I read your code, where "broken xml" is generated because, first, you 
> should not use:
>      w.setCodec("ISO-8859-1");
> which is not a supported encoding,

Apart from the fact that I'm not doing that, I'm actually doing

w.setCodec( QTextCodec::codecForName( "ISO-8859-1" ) );

I don't see why latin1 would not be a supported codec to be used for
XML. The spec doesn't say anything about it.

> at least, up to Qt 4.5.0. Neither 
> "ISO-8859-1" nor any of its possible aliases, "latin1", "CP819", 
> "IBM819", and "iso-ir-100", is included in the supported encoding list.

Are you talking about the QTextCodec API docs? I actually expect Qt to
know about the common aliases for a given codec and automatically do the
right thing, and I would file a bug report with Qt Software if, for example,
"ISO-8859-1" or "latin1" didn't work (but "ISO 8859-1" did). See:

#include <QtCore>

int main(int argc, char** argv)
{
    // Look up Latin-1 under three different spellings of its name.
    QTextCodec* latin1 = QTextCodec::codecForName("ISO-8859-1");
    QTextCodec* latin1_2 = QTextCodec::codecForName("ISO 8859-1");
    QTextCodec* latin1_3 = QTextCodec::codecForName("latin1");
    qDebug() << latin1->name() << latin1_2->name() << latin1_3->name();
    // codecForName() returns 0 for a name that matches no codec.
    QTextCodec* invalid = QTextCodec::codecForName("MyFavouriteInvalidCodec");
    qDebug() << invalid;
    return 0;
}

> Second, what does writeStartDocument("1.0"); mean? It will write out a 
> processing instruction (PI) line. Then, if you call 
> writeProcessingInstruction(), of course, you will get two PI lines.
> You shouldn't call writeStartDocument() if 
> writeProcessingInstruction() does the job.

Which is not what the examples show as good coding style nor how the
class is documented. IMHO you're misusing the API and you're relying on
the fact that writeEndDocument doesn't do any important cleanups or
similar in the actual implementation. If Qt ever changes its
implementation such that you have to use writeEndDocument, but also need
writeStartDocument then you're screwed.

> Third, it seems the method fromUtf8() should be removed while writing an 
> element content.

No, it shouldn't, because my source file was stored as UTF-8 (I just
realized that this might have been changed in the mail) and hence to get
the proper string out of the literal I had to use fromUtf8.
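
To illustrate why the decoding step matters, here is a minimal sketch in
plain standard C++ (deliberately not Qt). The helper names decodeLatin1 and
decodeUtf8 are made up for this illustration, and the UTF-8 decoder assumes
well-formed input. It shows what happens to the bytes of a UTF-8 encoded
literal when they are decoded as UTF-8 versus read byte by byte as Latin-1,
which is roughly the difference between using fromUtf8 and relying on a
Latin-1 default conversion (the latter being an assumption about the
conversion in effect).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Latin-1 maps every byte directly to the code point with the same value.
std::vector<char32_t> decodeLatin1(const std::string& bytes)
{
    std::vector<char32_t> out;
    for (unsigned char b : bytes)
        out.push_back(b);
    return out;
}

// Minimal UTF-8 decoder (hypothetical helper). It assumes well-formed input
// and only checks that a multi-byte sequence is not truncated.
std::vector<char32_t> decodeUtf8(const std::string& bytes)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0;   // number of continuation bytes to read
        char32_t cp = 0;         // code point being assembled
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        assert(i + extra < bytes.size());  // sequence must not be truncated
        for (std::size_t k = 1; k <= extra; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

For the two bytes 0xC3 0xA4 (the UTF-8 encoding of an a-umlaut),
decodeUtf8 yields the single code point U+00E4, while decodeLatin1 yields
the two code points U+00C3 and U+00A4, i.e. two wrong characters.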

Andreas

-- 
Don't tell any big lies today.  Small ones can be just as effective.