[Development] HEADS-UP: QStringLiteral
edward.welbourne at qt.io
Thu Aug 22 14:31:44 CEST 2019
On 21 Aug 2019, at 17:55, Thiago Macieira <thiago.macieira at intel.com> wrote:
>> That's why we are not removing QLatin1String: the Latin1 algorithm is
>> as fast as memcpy. The only thing better than that is zero copies.
Lars Knoll (22 August 2019 13:42) replied:
> We could also turn this around: Are we over-optimising here? Do we
> have the right balance between ease of use and performance? Converting
> utf8 is a bit more costly than latin1, but would that ever matter in
> real world use cases?
I guess that's a matter of how much of a performance penalty you're
willing to pay.
If you look at qtbase/src/corelib/codecs/qutfcodec.cpp you'll see it's
all far from trivial, and it leans on functions that are heavily
optimised for particular instruction sets, to squeeze out every bit of
performance they can. It's highly complex and I won't pretend to
understand it. That's the UTF8 path Thiago is talking about.
There is no short-cut, although I do wonder why there isn't a "search
for the first byte whose top bit is set", which might equip us with one.
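To make that "search for the first byte whose top bit is set" idea
concrete, here is a minimal sketch in plain C++. The helper names
firstNonAsciiIndex() and widenAsciiPrefix() are made up for
illustration and are not Qt functions; I'm only assuming the input is
a byte buffer and the output is UTF-16 code units. Everything before
the first such byte is plain ASCII, so it can be widened as cheaply as
Latin1; only the rest would need the full UTF8 decoder.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Qt): index of the first byte with its
// top bit set, or len if every byte in the buffer is plain ASCII.
std::size_t firstNonAsciiIndex(const char *src, std::size_t len)
{
    for (std::size_t i = 0; i < len; ++i) {
        if (static_cast<unsigned char>(src[i]) & 0x80)
            return i;
    }
    return len;
}

// Hypothetical helper (not part of Qt): widens the leading ASCII run into
// UTF-16 code units and returns how many bytes were consumed. The caller
// would hand the remaining bytes to a real UTF-8 decoder.
std::size_t widenAsciiPrefix(const char *src, std::size_t len, char16_t *dst)
{
    const std::size_t n = firstNonAsciiIndex(src, len);
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = static_cast<unsigned char>(src[i]);
    return n;
}
```

Whether this would actually beat what qutfcodec.cpp already does is
something only a benchmark could say.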
The QLatin1String path is essentially qt_from_latin1() in qstring.cpp;
50 lines of code, vs several hundred; it does have some fancy
optimisations to exploit special CPU instructions; but one of its #if
paths really is just
    *dst++ = (uchar)*str++;
i.e., as Thiago said, essentially (except for *dst being bigger than
*str) memcpy(). The other code paths are ways to optimise that.
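For anyone who wants to see that one-liner in context, here is a
self-contained sketch of the plain fallback loop. The name
latin1ToUtf16() and its signature are my own, chosen for illustration,
and should not be read as the actual declaration in qstring.cpp. The
point is that each Latin1 byte maps directly to the UTF-16 code unit
with the same value, so the loop is a copy that widens each element.

```cpp
#include <cstddef>

// Illustrative stand-in for the unoptimised code path; not Qt's function.
// Each Latin1 byte becomes the UTF-16 code unit with the same numeric value.
void latin1ToUtf16(char16_t *dst, const char *str, std::size_t size)
{
    while (size--) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        *dst++ = static_cast<unsigned char>(*str++);
    }
}
```

The cast through unsigned char matters: without it, a byte such as
0xE9 could turn into 0xFFE9 on platforms where char is signed.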
The Latin1 path is obviously way cheaper than the UTF8 one. Is that a
price you're willing to pay, throughout all our library code, for ease
of writing library code?