[Qt-interest] How to compress QString
Thiago Macieira
thiago.macieira at trolltech.com
Thu Apr 16 17:26:43 CEST 2009
On Thursday, 16 April 2009, at 16:58:51, Jan Kundrát wrote:
> Oliver.Knoll at comit.ch wrote:
> > How is UTF-8 more stable and better documented than UTF-16? I thought
> > the range of Unicode characters is pretty well-defined? Or am I
> > missing something here?
>
> By the time I wrote that mail, I wasn't aware of the fact that Qt
> declares that it uses UTF-16 for QStrings internally. If there was no
> such guarantee, you'd have problems when touching QString's char*
> returned by data().
QString::data() doesn't return char*. It returns QChar*.
Remember: QString is an array of QChar, each of which is a UTF-16 code unit. If
you treated QString::data() as char*, you'd probably run into problems because
every other byte is 0 for typical text (ASCII and Latin 1 characters are below
0x100, so the high byte is 0).
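To illustrate the zero-byte problem without depending on Qt, here is a minimal sketch that uses std::u16string as a stand-in for QString's UTF-16 storage. The helper names countZeroBytes and naiveCStringLength are hypothetical, and the sketch assumes char16_t is two bytes wide, as on common platforms.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: counts the zero bytes in the raw storage of a
// UTF-16 string (the terminating null code unit is not included).
std::size_t countZeroBytes(const std::u16string &s)
{
    const char *bytes = reinterpret_cast<const char *>(s.data());
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < s.size() * sizeof(char16_t); ++i) {
        if (bytes[i] == 0)
            ++zeros;
    }
    return zeros;
}

// Hypothetical helper: what a C string function "sees" when the UTF-16
// buffer is misread as char*. strlen stops at the first zero byte, which
// for ASCII text is the high byte of the first code unit.
std::size_t naiveCStringLength(const std::u16string &s)
{
    return std::strlen(reinterpret_cast<const char *>(s.data()));
}
```

For the two-character string u"Qt", countZeroBytes finds one zero byte per character, and naiveCStringLength reports at most 1 (1 on little-endian machines, 0 on big-endian ones) instead of 2.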
> However, I'm not sure if the size of the str.utf16() is really
> str.size() * sizeof(ushort) -- what happens if there are surrogate pairs
> in the string? Will they be reflected in QString's size()?
It's UTF-16, not UCS-4 or some other weird encoding that treats codepoints
above U+FFFF differently.
That means surrogate pairs may appear in the string, in their correct order.
That means QString::length() (and size() and count()) return the number of
UTF-16 code units (16-bit words), not the number of Unicode codepoints. Also
note that the number of codepoints is different from the string's width, even
in a monospace font (there are codepoints with zero width, normal width or
double width).
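The difference between code units and codepoints can be sketched as follows, again with std::u16string standing in for QString (its size() plays the role of QString::size() here). The function names are hypothetical, and unpaired surrogates are simply counted as one codepoint each, which is an assumption of this sketch rather than anything Qt prescribes.

```cpp
#include <cstddef>
#include <string>

// Surrogate ranges as defined by UTF-16.
bool isHighSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
bool isLowSurrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Hypothetical helper: counts Unicode codepoints in a UTF-16 string.
// A high surrogate followed by a low surrogate forms one codepoint;
// every other code unit (including a stray surrogate) counts as one.
std::size_t countCodePoints(const std::u16string &s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (isHighSurrogate(s[i]) && i + 1 < s.size() && isLowSurrogate(s[i + 1]))
            ++i; // skip the low half of the pair
        ++count;
    }
    return count;
}
```

For u"a\U0001D11E" (the letter a followed by U+1D11E, which lies above U+FFFF), size() is 3 while countCodePoints returns 2. The buffer size in bytes is still size() * sizeof(char16_t), because each half of a surrogate pair is counted as its own code unit.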
--
Thiago Macieira - thiago.macieira (AT) nokia.com
Senior Product Manager - Nokia, Qt Software
Sandakerveien 116, NO-0402 Oslo, Norway