[Qt-interest] Parsing a HTML document with non-supported htmlsubsets
Castagne Nicolas
nicolascastagne at yahoo.fr
Wed Jul 22 09:55:04 CEST 2009
Hello Tony,
It can, through export. But I am working on copy/paste features.
Unfortunately, Excel's copy function does not put XHTML on the clipboard :(
It only puts the text/html and text/plain MIME types on the clipboard.
And only the text/html version encodes the "full value", in an x:num attribute of the <td> tags.
I enclose the content of a text/html copied from Excel in an attached document.
Well, anyway, I'll manage with regexp parsing. But that's not ideal.
This is the third time I have come to think that Qt would benefit from having an HTML parser :/
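For the record, here is a rough sketch of the kind of regexp extraction I have in mind. It uses plain std::regex rather than any Qt class, the function name extractXNum is made up for illustration, and it assumes cells look roughly like <td ... x:num="1234.5678">1,234.57</td>. It is not a real HTML parser and will miss anything that does not fit that shape.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect the value of every x:num attribute that
// appears on a <td> tag. Assumes the value is written as "...", '...',
// or as a bare token without whitespace.
std::vector<std::string> extractXNum(const std::string& html)
{
    static const std::regex tdNum(
        R"re(<td\b[^>]*?\bx:num\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+)))re",
        std::regex::icase);

    std::vector<std::string> values;
    for (std::sregex_iterator it(html.cbegin(), html.cend(), tdNum), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        // Only one of the three capture groups takes part in a given match.
        for (std::size_t g = 1; g <= 3; ++g) {
            if (m[g].matched) {
                values.push_back(m[g].str());
                break;
            }
        }
    }
    return values;
}
```

Cells without an x:num attribute are simply skipped, so the result only lists the numeric cells, in document order.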
Best-
Nicolas
--- On Wed 22.7.09, Tony Rietwyk <tony.rietwyk at rightsoft.com.au> wrote:
From: Tony Rietwyk <tony.rietwyk at rightsoft.com.au>
Subject: RE: [Qt-interest] Parsing a HTML document with non-supported htmlsubsets
To: "'Castagne Nicolas'" <nicolascastagne at yahoo.fr>
Date: Wednesday 22 July 2009, 2:33
Hi Nicolas,
Can Excel output XHTML, rather than HTML?
Regards,
-----Original Message-----
From: qt-interest-bounces at trolltech.com [mailto:qt-interest-bounces at trolltech.com] On Behalf Of Castagne Nicolas
Sent: Tuesday, 21 July 2009 22:04
To: Qt-interest
Subject: Re: [Qt-interest] Parsing a HTML document with non-supported htmlsubsets
> This may not be the best solution, but I guess you could try to parse your
> HTML file manually as an XML document.
Thanks Constantin,
I have tried that, but the HTML string generated by Excel is not a valid XML file :/
The HTML string has, for example, something like:
____
<body link="#0000d4" vlink="#993366">
<table border=0 cellpadding=0 cellspacing=0 width=190
style='border-collapse: collapse'>
___
and QDomDocument::setContent complains about an "unexpected character" at line 3, column 15.
I guess that is because the attribute values (e.g. border) are not enclosed in quotes.
Yeah, I know, Excel is bad :)
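One thing that might be worth trying before giving up on the XML route is to quote the bare attribute values first and then retry QDomDocument::setContent. Below is a minimal sketch of that idea with plain std::regex. The names quoteInTag and quoteAttributes are made up for illustration, and the code assumes no attribute value contains '<' or '>'. Quoting alone may well not be enough, since the Excel output could still contain unclosed tags or entities that an XML parser rejects.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: inside one start tag, wrap every bare attribute
// value in double quotes. Values already quoted with " or ' are kept
// as they are (both forms are valid in XML).
std::string quoteInTag(const std::string& tagText)
{
    // Group 1: whitespace, attribute name and '='. Group 2: the value.
    static const std::regex attr(
        R"re((\s[^\s=<>"'/]+\s*=\s*)("[^"]*"|'[^']*'|[^\s"'<>]+))re");

    std::string out;
    auto pos = tagText.cbegin();
    for (std::sregex_iterator a(tagText.cbegin(), tagText.cend(), attr), end;
         a != end; ++a) {
        const std::smatch& m = *a;
        out.append(pos, m[0].first);            // text before this attribute
        const std::string value = m[2].str();
        const bool quoted = value.front() == '"' || value.front() == '\'';
        out += m[1].str();
        if (quoted) {
            out += value;
        } else {
            out += '"' + value + '"';
        }
        pos = m[0].second;
    }
    out.append(pos, tagText.cend());            // rest of the tag
    return out;
}

// Apply quoteInTag to every start tag; text between tags is untouched.
std::string quoteAttributes(const std::string& html)
{
    static const std::regex tag(R"re(<[A-Za-z][^<>]*>)re");

    std::string out;
    auto pos = html.cbegin();
    for (std::sregex_iterator t(html.cbegin(), html.cend(), tag), end;
         t != end; ++t) {
        out.append(pos, (*t)[0].first);         // text before this tag
        out += quoteInTag(t->str());
        pos = (*t)[0].second;
    }
    out.append(pos, html.cend());
    return out;
}
```

Restricting the rewrite to start tags keeps ordinary cell text such as "a = b" from being altered.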
Any other hints welcome. I'll try parsing with regexp.
Best-
Nicolas
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ExcelClipboardContent_text_html.txt
Url: http://lists.qt-project.org/pipermail/qt-interest-old/attachments/20090722/9c52d5c6/attachment.txt