[Development] [Question] Implementation of XML character validation

Kurt Pattyn pattyn.kurt at gmail.com
Sun Sep 8 22:39:07 CEST 2013


On 08 Sep 2013, at 20:43, Thiago Macieira <thiago.macieira at intel.com> wrote:

> On domingo, 8 de setembro de 2013 20:36:39, Kurt Pattyn wrote:
>> bool QXmlUtils::isChar(const QChar c)
>> {
>>    return (c.unicode() >= 0x0020 && c.unicode() <= 0xD7FF)
>>           || c.unicode() == 0x0009
>>           || c.unicode() == 0x000A
>>           || c.unicode() == 0x000D
>>           || (c.unicode() >= 0xE000 && c.unicode() <= 0xFFFD);
>> }
>> Isn't this code missing the check 
>> c >= 0x10000 && c <= QChar::LastValidCodePoint ?
> 
> No.
> 
> It's limited by the size of QChar. It cannot contain 0x10000.
> 
> No, the entire API is flawed. It should work in terms of UCS-4, not of QChar. 

Could one solution be to widen QChar so that it holds a 32-bit code point instead of a 16-bit code unit, and have the unicode() function return a UCS-4 value?

At the very least, I think it would be good to have the checks for valid XML characters concentrated in one place.
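
To make the idea concrete, here is a minimal sketch of such a single check, written in plain C++ without Qt types so that it stands alone. It assumes the validation is done on full code points (UCS-4) and that the caller's text is UTF-16, so surrogate pairs are decoded before the range test. The names isXmlChar and isValidXmlText are hypothetical and are not part of QXmlUtils; the ranges follow the Char production of XML 1.0.

```cpp
#include <cassert>   // only needed if you try the checks mentioned below
#include <cstddef>
#include <string>

// Hypothetical helper: XML 1.0 Char production on a full code point.
// Unlike a 16-bit check, this can express the 0x10000..0x10FFFF range.
inline bool isXmlChar(char32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
}

// Hypothetical helper: walks UTF-16 code units, combines surrogate
// pairs into one code point, and validates each code point.
inline bool isValidXmlText(const std::u16string &text)
{
    for (std::size_t i = 0; i < text.size(); ++i) {
        char32_t cp = text[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: a low surrogate must follow.
            if (i + 1 >= text.size())
                return false;
            const char32_t low = text[i + 1];
            if (low < 0xDC00 || low > 0xDFFF)
                return false;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
            ++i; // consume the low surrogate as well
        }
        // A lone low surrogate stays in 0xDC00..0xDFFF and is
        // rejected here, because isXmlChar excludes that range.
        if (!isXmlChar(cp))
            return false;
    }
    return true;
}
```

With this split, isXmlChar(0x1F600) is true while isXmlChar(0xFFFE) is false, and isValidXmlText rejects a string that ends in an unpaired high surrogate. The range test lives in one function, and only the decoding loop needs to know about UTF-16.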

> The code calling this API needs to do the surrogate decoding. This class may 
> be interesting for them:
> 
> https://codereview.qt-project.org/669
> 
> -- 
> Thiago Macieira - thiago.macieira (AT) intel.com
>  Software Architect - Intel Open Source Technology Center

Kurt

