[Qt-interest] Does Qt support Unicode 5.1?

Thiago Macieira thiago.macieira at trolltech.com
Fri Mar 6 02:20:26 CET 2009


Constantin Makshin wrote:
>static QChar HighSurrogate (unsigned c)
>{
>     return QChar(((c - 0x10000) >> 10 & 0x3ff) + 0xd800);
>}
>
>static QChar LowSurrogate (unsigned c)
>{
>     return QChar((c & 0x3ff) + 0xdc00);
>}
>
>To Qt developers: Your surrogate characters handling code seems to be  
>wrong

Would you care to elaborate? What's wrong with QChar::lowSurrogate and 
QChar::highSurrogate?

    static inline ushort highSurrogate(uint ucs4) {
        return (ucs4>>10) + 0xd7c0;
    }
    static inline ushort lowSurrogate(uint ucs4) {
        return ucs4%0x400 + 0xdc00;
    }

They're written differently, but it's the same math. The low surrogate is 
easy to spot: for unsigned values, modulus 0x400 and bitwise-and 0x3ff are 
the same operation.
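
To check the low-surrogate claim, here is a minimal sketch in plain C++ 
with no Qt dependency. The names lowSurrogateMod, lowSurrogateAnd and 
lowSurrogatesAgree are hypothetical, made up for this illustration; they 
are not Qt API. It compares the two forms over every code point that 
needs a surrogate pair.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the two formulations; neither is Qt API.

// Modulus form, as in the QChar::lowSurrogate code shown above.
static inline std::uint16_t lowSurrogateMod(std::uint32_t ucs4)
{
    return static_cast<std::uint16_t>(ucs4 % 0x400 + 0xdc00);
}

// Bitwise-and form, as in the quoted LowSurrogate function.
static inline std::uint16_t lowSurrogateAnd(std::uint32_t ucs4)
{
    return static_cast<std::uint16_t>((ucs4 & 0x3ff) + 0xdc00);
}

// Returns true if both forms agree for every code point in
// U+10000..U+10FFFF, the range that is encoded with surrogate pairs.
static bool lowSurrogatesAgree()
{
    for (std::uint32_t c = 0x10000; c <= 0x10FFFF; ++c) {
        if (lowSurrogateMod(c) != lowSurrogateAnd(c))
            return false;
    }
    return true;
}
```

A caller could simply write assert(lowSurrogatesAgree()); in a test.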

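As for the high surrogate, instead of subtracting 0x10000 before the 
right-shift, we subtract it after. That is valid because 0x10000 is an 
exact multiple of 0x400, so the subtraction never borrows from the bits 
that the shift discards. See the math below. Note that the bitwise-and 
with 0x3ff is unnecessary since Unicode is limited to 0x10FFFF anyway: 
(0x10FFFF - 0x10000) >> 10 is 0x3ff, so the mask never clears a bit.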

	((x - 0x10000) >> 10) + 0xd800
	(x >> 10) - (0x10000 >> 10) + 0xd800
	(x >> 10) - 0x40 + 0xd800
	(x >> 10) + 0xd7c0
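
The derivation above can also be checked by brute force. This is only a 
sketch in plain C++ with no Qt dependency; highSurrogateSubtractFirst, 
highSurrogateFolded and highSurrogatesAgree are hypothetical names chosen 
for this example, not Qt API. The first function keeps the 0x3ff mask 
from the quoted code, so the comparison also shows the mask is redundant 
for valid input.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; neither function is Qt API.

// Subtract 0x10000 first, shift, mask, then add 0xd800 (the quoted form).
static inline std::uint16_t highSurrogateSubtractFirst(std::uint32_t ucs4)
{
    return static_cast<std::uint16_t>(
        (((ucs4 - 0x10000) >> 10) & 0x3ff) + 0xd800);
}

// Shift first, then add the folded constant 0xd800 - 0x40 = 0xd7c0.
static inline std::uint16_t highSurrogateFolded(std::uint32_t ucs4)
{
    return static_cast<std::uint16_t>((ucs4 >> 10) + 0xd7c0);
}

// Returns true if both forms agree for every code point in
// U+10000..U+10FFFF. Input outside that range is not covered here.
static bool highSurrogatesAgree()
{
    for (std::uint32_t c = 0x10000; c <= 0x10FFFF; ++c) {
        if (highSurrogateSubtractFirst(c) != highSurrogateFolded(c))
            return false;
    }
    return true;
}
```

For instance, U+1D11E gives 0x74 after the shift, and 0x74 + 0xd7c0 is 
0xd834 in both forms.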

In fact, a good compiler should realise that and optimise the extra 
operation away. We're just staying on the safe side and doing that work 
for the compiler.

-- 
Thiago Macieira - thiago.macieira (AT) nokia.com
  Senior Product Manager - Nokia, Qt Software
      Sandakerveien 116, NO-0402 Oslo, Norway