[Qt-interest] How to parse a paragraph twice with an XML stream parser?
Lingfa Yang
lingfa at brandeis.edu
Tue May 12 16:34:38 CEST 2009
Thank you for your reply. Your scheme, caching the content, is feasible.
I can use QXmlStreamReader to grab the paragraph content and save it as a
QString with the help of QXmlStreamWriter, then create a new
QXmlStreamReader to read that string.
The problem is efficiency. 1) QXmlStreamReader does not hand me the raw
data. For example, when I get '<' or '>' characters, they have already
been unescaped from &lt; and &gt; in the XML file, and writing the content
back out has to escape them again, which sounds like extra cost. 2) Under
the hood, reading with QXmlStreamReader is a tokenizing process. Every
time I create a parser to read the cached string, it tokenizes again. Is
it possible to cache the tokens and read from the cached tokens instead?
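
To make the question concrete, here is a minimal sketch of what such a
token cache could look like. It is plain standard C++, not Qt: Token,
TokenKind, captureElement, textLength and collectText are hypothetical
names, and the vector called source only stands in for the tokens that a
reader such as QXmlStreamReader hands out one at a time through
readNext(). In a real Qt program the capture loop would presumably copy
tokenType(), name(), attributes() and text() from the reader instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the token a pull parser reports on each step.
enum class TokenKind { StartElement, Characters, EndElement };

struct Token {
    TokenKind kind;
    std::string text;  // element name, or character data that is already unescaped
};

// Copies the tokens of one element, from its start tag through the matching
// end tag, and advances pos past it. source plays the role of the live stream.
std::vector<Token> captureElement(const std::vector<Token>& source, std::size_t& pos) {
    assert(pos < source.size() && source[pos].kind == TokenKind::StartElement);
    std::vector<Token> cached;
    int depth = 0;  // nesting level, so that child elements are captured too
    while (pos < source.size()) {
        const Token& t = source[pos++];
        cached.push_back(t);
        if (t.kind == TokenKind::StartElement) ++depth;
        if (t.kind == TokenKind::EndElement && --depth == 0) break;
    }
    return cached;
}

// First pass over the cached tokens: total length of the character data.
std::size_t textLength(const std::vector<Token>& tokens) {
    std::size_t n = 0;
    for (const Token& t : tokens)
        if (t.kind == TokenKind::Characters) n += t.text.size();
    return n;
}

// Second pass over the same cached tokens: concatenated character data.
std::string collectText(const std::vector<Token>& tokens) {
    std::string out;
    for (const Token& t : tokens)
        if (t.kind == TokenKind::Characters) out += t.text;
    return out;
}
```

Because the cached text is stored already unescaped, replaying it needs
neither a second tokenizing pass nor a second escaping step. The price is
the memory for one paragraph's tokens at a time.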
Lingfa
Paul Miller wrote:
> Lingfa Yang wrote:
>> Hi XML experts or XML application developers,
>>
>> I am using QXmlStreamReader to parse document.xml. The file can be
>> huge, so I prefer this parser to DOM.
>>
>> The document.xml contains many paragraphs, and I have to parse each
>> paragraph twice. I wish QXmlStreamReader could remember the start
>> element of each p (paragraph) tag, so that when the first read
>> finishes I could reset to the p tag and read it a second time. It
>> seems QXmlStreamReader cannot do that.
>
> The stream reader simply can't do that - remember, a "stream" could be
> a sequence of characters sent over a wire (like the Internet), and it's
> not cached anywhere.
>
> What you should probably do is cache the contents of the element
> yourself. When you encounter one of these that you need to "read"
> twice, store the contents and related attributes in your own cache,
> then when you need to "read" it again, just read from your copy.
>