[Development] Why can't QString use UTF-8 internally?

Konstantin Ritt ritt.ks at gmail.com
Tue Feb 10 22:52:34 CET 2015


2015-02-11 1:26 GMT+04:00 Thiago Macieira <thiago.macieira at intel.com>:

> On Wednesday 11 February 2015 00:37:41 Konstantin Ritt wrote:
> > Yes, that would be an ideal solution. Unfortunately, that would also break
> > a LOT of existing code.
> > In Qt4 times, I was doing some experiments with the QString adaptive
> > storage (similar to what NSString does behind the scenes).
>
> I've thought of this too.
>
> This stumbles on QString's implicit sharing. If you do this:
>
>         QString foo = "some UTF-8 text";
>         QString copy = foo;
>         qDebug() << foo.constData()[0];
>

In my experiments (a QString with adaptive storage), data()/constData()
return uchar*, so `qDebug() << foo.constData()[0]` could become, e.g.,
`qDebug() << foo.utf8Data()[0]`.
There are even more invasive changes to the behavior: indexes are
UTF-32 code point positions, length() is the number of UTF-32 code points,
etc.
I believe we could sacrifice some performance in QString write operations
for the sake of convenience and faster QString read operations. WDYT?
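
To make the idea concrete, here is a minimal sketch in plain C++ (not Qt code,
and not the code from my experiments). The class name Utf8String and its
members are hypothetical; it only illustrates UTF-8 storage with code point
based length() and indexing, and it assumes well-formed UTF-8 input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical sketch, not a Qt API: bytes are stored as UTF-8,
// while length() and at() work in Unicode code points.
class Utf8String {
public:
    explicit Utf8String(std::string utf8) : bytes_(std::move(utf8)) {}

    // Raw access to the UTF-8 storage (the role utf8Data() plays above).
    const unsigned char *utf8Data() const {
        return reinterpret_cast<const unsigned char *>(bytes_.data());
    }

    // Size of the storage in bytes.
    std::size_t byteSize() const { return bytes_.size(); }

    // Number of code points: count every byte that is not a
    // continuation byte (10xxxxxx). This is an O(n) scan.
    std::size_t length() const {
        std::size_t n = 0;
        for (unsigned char c : bytes_)
            if ((c & 0xC0) != 0x80)
                ++n;
        return n;
    }

    // Code point at code point index i. Also an O(n) scan, since UTF-8
    // is variable-width. Returns U+FFFD if i is out of range.
    char32_t at(std::size_t i) const {
        const unsigned char *p = utf8Data();
        const unsigned char *end = p + bytes_.size();
        // Skip i whole code points (lead byte plus continuation bytes).
        while (i > 0 && p < end) {
            ++p;
            while (p < end && (*p & 0xC0) == 0x80)
                ++p;
            --i;
        }
        if (p >= end)
            return 0xFFFD;
        const unsigned char lead = *p;
        if (lead < 0x80)
            return lead;  // ASCII, single byte
        // Number of continuation bytes implied by the lead byte.
        const int extra = lead >= 0xF0 ? 3 : lead >= 0xE0 ? 2 : 1;
        char32_t cp = lead & (0x3F >> extra);  // payload bits of the lead byte
        for (int k = 1; k <= extra && p + k < end; ++k)
            cp = (cp << 6) | (p[k] & 0x3F);
        return cp;
    }

private:
    std::string bytes_;
};
```

In this sketch, reads of the raw UTF-8 data are free, while at() and length()
scan the string; a real adaptive storage would presumably cache or pick a
fixed-width representation where that scan hurts.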

