[Qt-interest] Html parsing

Sean Harmer sean.harmer at maps-technology.com
Mon Dec 1 16:21:44 CET 2008


Hi,

On Monday 01 December 2008 14:31:40 Frédéric LECONTE wrote:
> > > I tried to parse it with QXmlStreamReader but, as I expected, my web
> > > page is not valid XML (it is not an XHTML page; for information, the
> > > page I am trying to parse is translate.google.fr).
> >
> > One method is to run the html document through something that can convert
> > it to a valid XML document such as htmltidy.
>
> I don't want to validate my web page; I don't need to re-use it.
The point is that once the page has been converted to valid (that is,
well-formed) XML, you can parse it easily with any of the XML APIs that Qt
provides and extract the information you need. In other words, it gets round
the problem that you have encountered.
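
As a rough illustration of the idea (not Qt code, and not the real page): the
sketch below uses Python's standard library only to keep it self-contained.
The two markup snippets and the id "result" are made up for the example; the
second snippet stands in for what a tidying tool might produce from the
first. In Qt the parsing step would be done with QXmlStreamReader or
QDomDocument instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippets, invented for this example.
# RAW_HTML has an unclosed <br>, which is fine in HTML but not in XML.
RAW_HTML = '<html><body><div id="result">bonjour<br></div></body></html>'
# TIDIED is an assumed well-formed equivalent, as a tidying tool might emit.
TIDIED = '<html><body><div id="result">bonjour<br/></div></body></html>'


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


def text_by_id(markup, element_id):
    """Return the leading text of the first element whose id attribute
    matches, or None if no such element exists.

    Raises ET.ParseError if the markup is not well-formed XML.
    """
    root = ET.fromstring(markup)
    for element in root.iter():
        if element.get("id") == element_id:
            return element.text
    return None


if __name__ == "__main__":
    print(is_well_formed(RAW_HTML))      # False
    print(is_well_formed(TIDIED))        # True
    print(text_by_id(TIDIED, "result"))  # bonjour
```

The raw snippet is rejected outright, which is the problem you hit with
QXmlStreamReader; after conversion, pulling out a single field is a short
loop over the elements.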

> > > Can I use QWebPage, not display it on screen and find a way to get
> > > certain fields ?
> >
> > You can access elements via JavaScript, which has been discussed many
> > times on the list.
>
> It sounds like a complicated way to do a simple thing...
> Is there no simpler code?
It is not difficult; just search the archives and take a look at:

http://doc.trolltech.com/4.4/qwebframe.html#evaluateJavaScript

> PS: I'm a newcomer.
Welcome! :-)

Sean



