[Qt-interest] Why QXmlStreamReader reports XML content in a broken way?

Ben Bridgwater bbridgwater at gmail.com
Fri Jan 2 16:26:08 CET 2009


I think it's because what you call "escaped characters" are treated by the 
XML specification as "entity references". The entities "quot", "apos", 
"lt", "gt" and "amp" are predefined in the specification, but you can also 
define your own using declarations like <!ENTITY myentity "Replacement text">.
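
To illustrate why the text can arrive in pieces, here is a toy model in plain C++. It is only a sketch and not Qt's actual implementation: the function names splitAtEntityReferences and joinChunks are made up for this example, and the sketch simply assumes that every predefined entity reference starts a new chunk of character data. Under that assumption it reproduces the four chunks from the question, and joining the chunks recovers the full string.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy model only (hypothetical helper, not part of Qt): split element text
// into chunks, starting a new chunk at every predefined entity reference.
std::vector<std::string> splitAtEntityReferences(const std::string &raw)
{
    // The five entities predefined by the XML specification.
    static const std::map<std::string, char> predefined = {
        {"lt", '<'}, {"gt", '>'}, {"amp", '&'}, {"apos", '\''}, {"quot", '"'}
    };

    std::vector<std::string> chunks;
    std::string current;
    std::size_t i = 0;
    while (i < raw.size()) {
        if (raw[i] == '&') {
            const std::size_t semi = raw.find(';', i);
            if (semi != std::string::npos) {
                const auto it = predefined.find(raw.substr(i + 1, semi - i - 1));
                if (it != predefined.end()) {
                    // Flush the text collected so far, then begin a new
                    // chunk with the entity's replacement character.
                    if (!current.empty())
                        chunks.push_back(current);
                    current.assign(1, it->second);
                    i = semi + 1;
                    continue;
                }
            }
        }
        // Ordinary character (or an unrecognised '&'): copy it literally.
        current += raw[i++];
    }
    if (!current.empty())
        chunks.push_back(current);
    return chunks;
}

// Hypothetical helper: a consumer that wants the whole text just
// concatenates the chunks in the order they were reported.
std::string joinChunks(const std::vector<std::string> &chunks)
{
    std::string text;
    for (const std::string &chunk : chunks)
        text += chunk;
    return text;
}
```

Calling splitAtEntityReferences on the element content from the question and passing the result to joinChunks gives back the single string the poster expected, which is the same accumulation a client would do over consecutive Characters tokens.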

Per the QXmlSimpleReader documentation, the behavior you want should be 
the default, but you could try setting it explicitly via:

QXmlSimpleReader reader;
reader.setFeature("http://trolltech.com/xml/features/report-start-end-entity", false);

Ben

Lingfa Yang wrote:
> QXmlStreamReader users:
> 
> A normal XML element is supposed to be reported as three tokens: 
> StartElement, Characters, and EndElement.
> But in this element:
> <p>He said: &quot;I&apos;ll come again.&quot;</p>
> the content is reported four times:
> 1: "He said: "
> 2: ""I"
> 3: "'ll come again."
> 4: """
> 
> Does anyone know why QXmlStreamReader reports the content four times, 
> instead of once: "He said: "I'll come again."" ?
> Does this design have a benefit, such as making escaped characters easier 
> to detect?
> 
> Thanks,
> Lingfa


