[Interest] How to add tidy using Qt?

程梁 devbean at outlook.com
Thu Dec 27 08:38:50 CET 2012


Nope, I've done nothing else with it; this is just test code, so everything is in the same function:

const char* html = d->visualView->toFormattedHtml().toUtf8().constData();
TidyBuffer output;

TidyDoc tdoc = tidyCreate();


tidyBufInit(&output);
Bool ok = tidyOptSetBool(tdoc, TidyXmlOut, yes);
int rc = -1;
// config tidy
if ( ok ) ok = tidyOptSetBool( tdoc, TidyXmlDecl, no );
if ( ok ) ok = tidyOptSetBool( tdoc, TidyDropPropAttrs, yes );
if ( ok ) ok = tidyOptSetBool( tdoc, TidyMakeBare, yes );
if ( ok ) ok = tidyOptSetValue( tdoc, TidyBodyOnly, "yes" );
if ( ok ) ok = tidyOptSetBool( tdoc, TidyDropFontTags, yes );
if ( ok ) ok = tidyOptSetBool( tdoc, TidyFixComments, yes );
if ( ok ) ok = tidyOptSetBool( tdoc, TidyEscapeCdata, yes );
if ( ok ) ok = tidyOptSetBool( tdoc, TidyJoinStyles, yes );
if ( ok ) ok = tidyOptSetBool( tdoc, TidyEscapeCdata, yes);
if ( ok ) ok = tidyOptSetBool( tdoc, TidyHideComments, yes);
if ( ok ) ok = tidyOptSetBool( tdoc, TidyForceOutput, yes);
tidySetCharEncoding(tdoc, "utf8");
// start parsing (rc is still -1 here, so the first step is guarded by ok)
if (ok) {
    rc = tidyParseString(tdoc, html);
}
if (rc >= 0) {
    rc = tidyCleanAndRepair(tdoc);
}

if (rc >= 0) {
    rc = tidySaveBuffer(tdoc, &output);
}

if (rc >= 0) {
    printf("\nAnd here is the result:\n\n%s", output.bp);

    QByteArray ret = QByteArray(reinterpret_cast<char*>(output.bp),output.size);
} else {
    printf("A severe error (%d) occurred.\n", rc);
}

tidyBufFree(&output);
tidyRelease(tdoc);

This is all the code I use. The output is:

<html>
<head>
<meta name="generator"
content="HTML Tidy for Windows (vers 25 March 2009), see www.w3.org" />
<title></title>
</head>
<body>��@- at H a�S�������Ýï¿ï¿½ï¿½</body>
</html>

All tags generated by QWebView are correct, but the text I edited in the web view (using page()->setContentEditable(true);) is garbled. So I think the encoding is wrong. But I do use toUtf8() to get the char* (toUtf8().constData()).
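
One thing that may be worth ruling out besides encoding is the lifetime of the buffer behind the char*. The pointer returned by constData() is only usable while the byte array that owns the bytes still exists. Below is a minimal sketch of that rule in standard C++, using std::string as a stand-in for QByteArray so it compiles without Qt or tidy. makeUtf8() and lengthWithOwnerAlive() are hypothetical helpers made up for this illustration; they are not Qt or tidy functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for toFormattedHtml().toUtf8(): returns the bytes by value.
std::string makeUtf8() {
    // "<p>", two CJK characters encoded as UTF-8 (3 bytes each), then "</p>".
    return std::string("<p>\xE7\xA8\x8B\xE6\xA2\x81</p>");
}

// Risky pattern (shown only as a comment, never executed):
//     const char* html = makeUtf8().c_str();
// The temporary returned by makeUtf8() is destroyed at the end of that
// statement, so html would point at freed memory.

// Safer pattern: keep the owning object in a named variable for as long as
// the raw pointer is in use.
std::size_t lengthWithOwnerAlive() {
    const std::string owner = makeUtf8();  // lives until the end of this function
    assert(!owner.empty());
    const char* html = owner.c_str();      // valid while owner is alive
    return std::strlen(html);              // 3 + 6 + 4 bytes
}
```

In the Qt code, the equivalent would presumably be to store the result of toUtf8() in a QByteArray variable and call constData() on that variable, keeping it in scope until tidyParseString() has returned.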


Cheng Liang
Nanjing, China
http://www.devbean.info

From: tony at rightsoft.com.au
To: interest at qt-project.org
Date: Thu, 27 Dec 2012 18:15:49 +1100
Subject: Re: [Interest] How to add tidy using Qt?

Hi Cheng,

You didn't show how the html variable was used. Are you sure that the html pointer is valid?

I suggest:

QByteArray ba = webView->page()->mainFrame()->toHtml().toUtf8();
const char *html = ba.constData();
...

In your original code, the temporary string and byte array were destroyed at the end of that statement, leaving html as a dangling pointer. Even if the data looked valid immediately afterwards, subsequent allocations would have made the area appear corrupt. By storing the byte array in a named variable, you ensure that the pointer stays valid for the whole block.

Tony.

From: interest-bounces+tony=rightsoft.com.au at qt-project.org [mailto:interest-bounces+tony=rightsoft.com.au at qt-project.org] On Behalf Of 程梁
Sent: Thursday, 27 December 2012 4:52 PM
To: Qt Interest
Subject: [Interest] How to add tidy using Qt?

Hi, there! I want to add tidy as a library to a Qt program, where I use code like the following:

const char* html = webView->page()->mainFrame()->toHtml().toUtf8().constData();
// ...
TidyBuffer output;
Bool ok = tidyOptSetBool(tdoc, TidyXmlOut, yes);
tidySetCharEncoding(tdoc, "utf8");
tidySaveBuffer(tdoc, &output);
QByteArray ret = QByteArray(reinterpret_cast<char*>(output.bp),output.size);

But I got these warnings:

UTF-8 decoding error of 1 bytes : 0xff = U+0255lx
UTF-8 decoding error of 1 bytes : 0xff = U+0255lx
UTF-8 decoding error of 1 bytes : 0xe8 = U+0008lx
UTF-8 decoding error of 1 bytes : 0xe8 = U+0008lx
UTF-8 decoding error of 1 bytes : 0xfd = U+0001lx
UTF-8 decoding error of 1 bytes : 0xfd = U+0001lx
UTF-8 decoding error of 1 bytes : 0xfd = U+0001lx
UTF-8 decoding error of 1 bytes : 0xfd = U+0001lx

I have no idea how this happens. Could you help me? Thank you!

Cheng Liang
Nanjing, China
http://www.devbean.info 