[Development] HEADS-UP: QStringLiteral

Thu Aug 22 14:48:36 CEST 2019

Now that we use clang in Creator and QDoc, suppose we write a QString source code analysis tool using clang. The tool would parse sources looking for uses of QString and then analyze the code patterns where QString is used to find possible optimizations using the other Qt string classes.

The output would be a report suggesting the possible optimizations. Then we can tell customers to write code with QString first, because that's easy. Once it works, they run this tool to see where performance can be improved and which string classes to use.

We could do the same for QList.

martin

________________________________________
From: Development <development-bounces at qt-project.org> on behalf of Edward Welbourne <edward.welbourne at qt.io>
Sent: Thursday, August 22, 2019 2:31 PM
To: Lars Knoll
Cc: Thiago Macieira; Qt development mailing list
Subject: Re: [Development] HEADS-UP: QStringLiteral

On 21 Aug 2019, at 17:55, Thiago Macieira <thiago.macieira at intel.com> wrote:
>> That's why we are not removing QLatin1String: the Latin1 algorithm is
>> as fast as memcpy. The only thing better than that is zero copies.

Lars Knoll (22 August 2019 13:42) replied:
> We could also turn this around: Are we over-optimising here? Do we
> have the right balance between ease of use and performance? Converting
> utf8 is a bit more costly than latin1, but would that ever matter in
> real world use cases?

I guess that's a matter of how much of a performance penalty you're
willing to pay.

If you look at qtbase/src/corelib/codecs/qutfcodec.cpp you'll see it's
all very non-trivial - and makes heavy use of functions that are heavily
optimised to exploit particular instruction sets to squeeze all the
boosts they can into the performance.  It's highly complex and I won't
pretend to understand it.  That's the UTF8 path Thiago is talking about.
There is no short-cut, although I do wonder why there isn't a "search
for the first byte whose top bit is set", which might equip us with one.
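
A minimal sketch of the "search for the first byte whose top bit is set"
idea, in plain C++ without Qt. The name firstNonAscii is hypothetical and
is not taken from qutfcodec.cpp; the sketch makes no claim about what
that file's optimised paths already do. Every byte before the returned
index is plain ASCII, so that prefix could be widened one byte per code
unit, and only the remainder would need full UTF-8 decoding.

```cpp
#include <cstddef>

// Hypothetical helper (not a Qt function): returns the index of the
// first byte whose top bit is set, or size if the input is pure ASCII.
std::size_t firstNonAscii(const char *str, std::size_t size)
{
    for (std::size_t i = 0; i < size; ++i) {
        // Go through unsigned char so the test is well defined even
        // where plain char is signed.
        if (static_cast<unsigned char>(str[i]) & 0x80)
            return i;
    }
    return size;
}
```

For input that is entirely ASCII the function returns size, which is the
case where the short-cut would cover the whole string.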

The QLatin1String path is essentially qt_from_latin1() in qstring.cpp;
50 lines of code, vs several hundred; it does have some fancy
optimisations to exploit special CPU instructions; but one of its #if
paths really is just

    while (size--)
        *dst++ = (uchar)*str++;

i.e., as Thiago said, essentially (except for *dst being bigger than
*str) memcpy().  The other code paths are ways to optimise that.
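
To make the "memcpy() with a wider destination" point concrete, here is
a self-contained sketch of that scalar loop outside Qt. It assumes
std::u16string as a stand-in for QString's UTF-16 storage, and the name
widenLatin1 is hypothetical; it is not the actual qt_from_latin1()
signature. It works because every Latin-1 byte value is also the value
of the corresponding UTF-16 code unit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the scalar path quoted above: each Latin-1
// byte is copied into a 16-bit code unit with the same numeric value.
std::u16string widenLatin1(const char *str, std::size_t size)
{
    std::u16string out(size, u'\0');
    char16_t *dst = &out[0];
    while (size--) {
        // The cast prevents sign extension of bytes >= 0x80 on
        // platforms where plain char is signed.
        *dst++ = static_cast<unsigned char>(*str++);
    }
    return out;
}
```

No byte ever needs more than a cast, which is why this path can stay so
short next to a UTF-8 decoder.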

The Latin1 path is obviously way cheaper than the UTF-8 one.  Is that
difference a price you're willing to pay, throughout all our library
code, for ease of writing library code?

        Eddy.