[Qt-interest] Html parsing

Benjamin Lau blwy10v at gmail.com
Mon Dec 1 14:57:25 CET 2008


You can try this (read: I haven't tried it):

1. Load the page into a QWebFrame
2. Call addToJavaScriptWindowObject with a custom moc-ed QObject (one
you write yourself) that has slots like "startTag", "endTag", etc.
(QXmlStreamWriter is a reasonable model for the interface)
3. Write a JavaScript function that uses the browser DOM to walk
through the nodes in the page and call these slots on the object
you added
4. Pass the source of that function, plus the JavaScript code that
calls it, to evaluateJavaScript
5. After evaluateJavaScript returns, the custom moc-ed object with the
slots, if implemented properly, will have a record of the entire DOM
of the web page.
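
To make step 3 a bit more concrete, here is a rough sketch of what the
JavaScript side might look like. It is only an illustration: the slot
names startTag, endTag and text, and the object name domRecorder, are
my own placeholders for whatever you expose through
addToJavaScriptWindowObject, and I have not run it inside a QWebFrame.

```javascript
// Sketch of step 3: walk a DOM subtree and report each node to a recorder.
// "recorder" stands in for the object exposed via addToJavaScriptWindowObject;
// the slot names startTag, endTag and text are hypothetical.
function walkDom(node, recorder) {
    var ELEMENT_NODE = 1, TEXT_NODE = 3; // standard DOM nodeType values

    if (node.nodeType === TEXT_NODE) {
        recorder.text(node.nodeValue);
        return;
    }
    if (node.nodeType !== ELEMENT_NODE) {
        return; // skip comments, doctype and other node kinds
    }

    recorder.startTag(node.nodeName);
    // Visit children in document order, depth first.
    for (var child = node.firstChild; child; child = child.nextSibling) {
        walkDom(child, recorder);
    }
    recorder.endTag(node.nodeName);
}

// Inside the page the call would be something like (domRecorder is an
// assumed name for the object added in step 2):
//   walkDom(document.documentElement, domRecorder);
```

The walker only passes strings to the recorder, which should keep the
slot signatures on the C++ side simple. Attributes are left out here;
they could be reported with one more slot in the same way.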

I know the details are sketchy, but I don't know how to describe this
without resorting to code, and I don't have the code with me. When I
can get my hands on it, I'll post it (it hasn't been tested yet,
so it may take a while).

Hope this helps!
Ben


