[Development] [Question] Implementation of XML character validation

Giuseppe D'Angelo dangelog at gmail.com
Sun Sep 8 23:22:32 CEST 2013


Hi,

please, let's keep the discussion on the ML.

On 8 September 2013 23:10, Kurt Pattyn <pattyn.kurt at gmail.com> wrote:
> Hi Giuseppe,
>
> this is not mentioned in the documentation and, if QChar follows the Unicode v6.2 standard, cannot be correct: the method unicode() returns a 16-bit value, which even in UTF-16 is too short (UTF-16 can need two 16-bit values to represent one Unicode character).

Yes, unicode() returns the UTF-16 code unit held inside the QChar
(which is just a 16-bit number...). To encode code points above
0xFFFF in UTF-16, you need surrogate pairs, i.e. pairs of QChars.
QChar itself offers methods such as isHighSurrogate/isLowSurrogate,
and various statics (surrogateToUcs4(QChar, QChar), isSurrogate(uint),
lowSurrogate(uint), etc.).
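
For illustration, here is a minimal standalone sketch of the surrogate
arithmetic and of a per-code-point XML 1.0 check built on top of it. It
uses plain C++ and no Qt at all: the free functions below only borrow the
names of the QChar helpers and are not Qt's implementation, and
isXmlChar/isValidXmlText are hypothetical names made up for this example.
The ranges in isXmlChar are those of the Char production in the XML 1.0
specification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// High (leading) surrogates are the code units 0xD800..0xDBFF.
inline bool isHighSurrogate(std::uint16_t u) { return (u & 0xFC00) == 0xD800; }

// Low (trailing) surrogates are the code units 0xDC00..0xDFFF.
inline bool isLowSurrogate(std::uint16_t u) { return (u & 0xFC00) == 0xDC00; }

// Combine a surrogate pair into the code point it encodes (>= 0x10000).
inline std::uint32_t surrogateToUcs4(std::uint16_t high, std::uint16_t low)
{
    assert(isHighSurrogate(high) && isLowSurrogate(low));
    return ((std::uint32_t(high) - 0xD800u) << 10)
         + (std::uint32_t(low) - 0xDC00u) + 0x10000u;
}

// XML 1.0 "Char" production, applied to a full code point.
inline bool isXmlChar(std::uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
}

// Walk a UTF-16 string code unit by code unit, joining surrogate pairs
// before validating. Unpaired surrogates are rejected.
inline bool isValidXmlText(const std::u16string &s)
{
    for (std::size_t i = 0; i < s.size(); ++i) {
        std::uint32_t cp = s[i];
        if (isHighSurrogate(s[i])) {
            if (i + 1 >= s.size() || !isLowSurrogate(s[i + 1]))
                return false;          // high surrogate without its partner
            cp = surrogateToUcs4(s[i], s[i + 1]);
            ++i;                       // consume the low surrogate as well
        }
        // A lone low surrogate lands in 0xDC00..0xDFFF, which isXmlChar rejects.
        if (!isXmlChar(cp))
            return false;
    }
    return true;
}
```

The point is that validation has to happen on code points, not on single
16-bit units: checking each unit on its own would reject every character
above 0xFFFF, because both halves of a pair fall in the excluded
0xD800..0xDFFF range.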

HTH,

-- 
Giuseppe D'Angelo


