[Qt-interest] Why QXmlStreamReader reports XML content in a broken way?
Lingfa Yang
lingfa at brandeis.edu
Fri Jan 2 17:26:53 CET 2009
Thank you for your answer. My question is about QXmlStreamReader, not
QXmlSimpleReader.
QXmlStreamReader, which was introduced in Qt 4.3, is a faster and more
convenient replacement for QXmlSimpleReader.
Meanwhile, I checked another parser, Xerces
(http://en.wikipedia.org/wiki/Xerces), where element content is not
broken up at "entities". This makes me curious why
QXmlStreamReader does so. Is this a defect or an improvement?
Thanks,
Lingfa
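
To make the question concrete, here is a minimal, Qt-free sketch of the behaviour being discussed. It is not Qt's implementation; splitAtEntities and joinChunks are hypothetical helpers that only assume a parser starts a new text chunk at each predefined entity reference, and that a consumer can recover the full content by concatenating the chunks.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper (not a Qt API): splits raw element content into chunks,
// starting a new chunk at every predefined entity reference. This only mimics
// the chunk boundaries reported in this thread.
std::vector<std::string> splitAtEntities(const std::string& raw)
{
    static const std::map<std::string, char> predefined = {
        {"quot", '"'}, {"apos", '\''}, {"amp", '&'}, {"lt", '<'}, {"gt", '>'}
    };

    std::vector<std::string> chunks;
    std::string current;
    std::size_t i = 0;
    while (i < raw.size()) {
        if (raw[i] == '&') {
            const std::size_t semi = raw.find(';', i);
            if (semi != std::string::npos) {
                const auto it = predefined.find(raw.substr(i + 1, semi - i - 1));
                if (it != predefined.end()) {
                    // An entity reference closes the chunk collected so far
                    // and its replacement character opens the next one.
                    if (!current.empty()) {
                        chunks.push_back(current);
                        current.clear();
                    }
                    current.push_back(it->second);
                    i = semi + 1;
                    continue;
                }
            }
        }
        // Ordinary character (or an unrecognised '&'): keep it as-is.
        current.push_back(raw[i]);
        ++i;
    }
    if (!current.empty())
        chunks.push_back(current);
    return chunks;
}

// Hypothetical helper: what a consumer would do to get the content in one
// piece, namely append every chunk in order.
std::string joinChunks(const std::vector<std::string>& chunks)
{
    std::string whole;
    for (const std::string& chunk : chunks)
        whole += chunk;
    return whole;
}
```

Applied to the content He said: &quot;I&apos;ll come again.&quot; the sketch produces four chunks, and joining them gives the sentence back in one piece. With QXmlStreamReader, the analogous step would presumably be to append text() for each Characters token until the matching EndElement is reached.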
Ben Bridgwater wrote:
> I think it's because "escaped characters" are really considered in the
> XML specification as "entity references". The entities "quot", "apos"
> etc are predefined in the specification, but you can also define your
> own using declarations like <!ENTITY myentity "Replacement text">.
>
> Per the QXmlSimpleReader documentation, the behavior you want should be
> the default, but you could try setting it explicitly via:
>
> QXmlSimpleReader::setFeature("http://trolltech.com/xml/features/report-start-end-entity",
> false)
>
> Ben
>
> Lingfa Yang wrote:
>
>> QXmlStreamReader users:
>>
>> A normal XML element is supposed to be reported as three tokens:
>> StartElement, Characters, and EndElement.
>> But in this element:
>> <p>He said: &quot;I&apos;ll come again.&quot;</p>
>> the content is reported four times:
>> 1: "He said: "
>> 2: ""I"
>> 3: "'ll come again."
>> 4: """
>>
>> Does anyone know why QXmlStreamReader reports the content four times,
>> instead of once: He said: "I'll come again."
>> Does this design have a benefit, such as making escaped characters easier to detect?
>>
>> Thanks,
>> Lingfa
>>