[Development] HEADS-UP: QStringLiteral

Thu Aug 22 14:31:44 CEST 2019

On 21 Aug 2019, at 17:55, Thiago Macieira <thiago.macieira at intel.com> wrote:
>> That's why we are not removing QLatin1String: the Latin1 algorithm is
>> as fast as memcpy. The only thing better than that is zero copies.

Lars Knoll (22 August 2019 13:42) replied:
> We could also turn this around: Are we over-optimising here? Do we
> have the right balance between ease of use and performance? Converting
> utf8 is a bit more costly than latin1, but would that ever matter in
> real world use cases?

I guess that's a matter of how much of a performance penalty you're
willing to pay.

If you look at qtbase/src/corelib/codecs/qutfcodec.cpp you'll see it's
all very non-trivial - and makes heavy use of functions that are heavily
optimised to exploit particular instruction sets to squeeze all the
boosts they can into the performance.  It's highly complex and I won't
pretend to understand it.  That's the UTF8 path Thiago is talking about.
There is no short-cut, although I do wonder why there isn't a "search
for the first byte whose top bit is set", which might equip us with one.

The QLatin1String path is essentially qt_from_latin1() in qstring.cpp;
50 lines of code, vs several hundred; it does have some fancy
optimisations to exploit special CPU instructions; but one of its #if
paths really is just

    while (size--)
        *dst++ = (uchar)*str++;

i.e., as Thiago said, essentially (except for *dst being bigger than
*str) memcpy().  The other code paths are ways to optimise that.

The latter is obviously way cheaper than the former.  Is that a price
you're willing to pay, throughout all our library code, for ease of
writing library code ?

	Eddy.