[Development] How qAsConst and qExchange lead to qNN

Thu Nov 10 23:41:34 CET 2022

[

TL;DR :

Qt API should use

   QString foo() const;
   void setFoo(const QString &);

Neither QStringView nor C++20 changes this.

]

On Wed, Nov 09, 2022 at 10:52:15AM +0000, Marc Mutz via Development wrote:
> Hi Ulf,
> 
> On 09.11.22 08:18, Ulf Hermann via Development wrote:
> >> I don't want to take the Qt containers away from Qt users. I want to get
> >> rid of their use in our APIs, so that both the Qt implementation as well
> >> as our users are free to choose the best container for their needs
> >> instead of having to pick from one of the public Qt containers.
> >
> > I would like to know how that is supposed to work in practice. We have a 
> > lot of public API dealing with Qt containers all over. What are you 
> > going to do to, for example, to
> > 
> >      void addActions(const QList<QAction*> &actions);
> > 
> > in qwidget.h? What should it look like when we're done and our users are 
> > free to choose the best container for their needs?
> 
> In this particular case, the API would be
> 
>     void addActions(QSpan<QAction*> actions);
> 
> This is the trivial case: setters for contiguous data.

This might be trivial to implement, but the result is not obviously
better.

In case the actions would directly be stored in a QList<QAction *>
this suddenly causes a full deep copy of the list.

Even if we would apply this pattern only to cases where the contents
not stored as such, this introduces several problems:

 (a) we get inconsistent API, function signatures differ without
     obvious user-visible reason,
 (b) the signatures depend on the implemenations, i.e. details
     of the implementation now leak into the API,
 (c) the implementations cannot change freely anymore as 
     "necessary" changes to the signatures cannot be done due
      to the API
 (d) is gets spooky when a function takes several string args
     and needs a copy of some, but not of all

> It's backwards-compatible, because QList implicitly converts to QSpan (and 
> std::span), but now, as a user, I can also pass a C array or std::vector 
> or QVLA of QAction*:
> 
>     QAction *actions[] = {
>        new FooAction(this),
>        new BarAction(this),
>     };
>     addActions(actions);

Qt applications are using, well, Qt. People are of course free to store
QActions in a std::vector when they dislike QList, but of the end of the
day I very much prefer if people with 

> This is the same effect we have with Q*StringView, incl. avoidance of 
> caller-side construction and destruction of owning containers.

It's the same effect, but the frequency of the need of a caller side
construction of the owning container depend on the caller. Most notably,
the caller code can even influence that by using the "right" container
to start with. The "problem" you expect here only appears for code that
decides to deviate from Qt style, as soon as you are consistent, it
"just works".

And that's actually the reason why I am opposing this kind of change.

As you may know, we are migrating the Qt Creator code base towards a
state where file names are not kept as QStrings but as a separate type,
kind of lightweight URL containing scheme, host and path only. The
structure internally has seen a few iterations now, first three, than
four QStrings, now one QString as common store and three integral
offsets to access the scheme/host/path (sub)strings when needed, as
this means only 8 byte overhead over a plain QString, instead of the
48 bytes for the first shot using trhee full strings.

I am not bound by the same kind of API promises like Qt there, and
since this a quite heavily used structure I am in principle fine
with even "weird" API if it helps e.g. with performance. As a
consequence I actually tried a few ideas to see what helps and what
not /in practice/.

The result is effectively the following:

1. QStringView comes in handy when accessing these substrings.
2. ... but that's effectivly only due to the special nature of
   the storage here.
3. Cutting down such pieces further, comparing etc in the 
   caller feels good, QStringView does the right thing. No copies etc.
4. ... and it doesn't look as ugly as .midRef() etc before.
5. Passing such chunks further on to local helper functions that
   are "implementation details" feels good.
6. That's were it stops.

Most notably, going further, like using QStringView when crossing module
boundaries, feels wrong, even when it technically would work in some
cases.

We have quite a few cases where operation are prepared but executed
delayed, e.g. to not block the gui thread, or not at all, e.g. actions
for context menus that are never triggered by the user. In these cases
the data needs to be copied as the original can vanish.

In other cases the data passes through several layers of truly expensive
but unavoidable operations, so optimizing away one copy is not
noticable. Think for instance what is actually needed when you want to
find out, whether there's an executable "/usr/bin/qmake" on some remote
device.

A similar situation happens in some parsers, where result ofter are
just substrings of some original string (e.g. some file's contents).
But then we typically don't want to keep the whole contents around,
i.e. need to deep-copy the substring, too.

My personal conclusion here is that use of non-owning "containers in
regular API (let alone API with long-term stability promises like Qt) is
/not/ advisable, in fact, I am ready to call this an anti-pattern, as it
/introduces/ problems like lifetime considerations of the underlying real 
container but brings in total no visible gain.

Two more remarks:

(A) The "'occasionally' needed deep copy problem" could be addressed
partially by using something like

    class QStringOrStringView : public QStringView
    {
      public:
        QStringOrStringView(const ...unreal)
           : QStringView(real), source(nullptr)
        {}

        QStringOrStringView(const QString &real)
           : QStringView(real), source(&real)
        {}

        QString toString() const
        {
            return source ? *source : QStringView::toString();
        }

        ....
      private:
        const QString *source = nullptr;
    };

This would reduce at least the impact on users of Qt conventions
but not help people who insist on deviations. I don't suggest using
this.

(B) I have, naturally, asked myself how come that I often end up with
different conlusions then "pure C++" gurus. I believe there are two
reasons, a technical and a non-technical one:
 - Qt containers are reference counted. Keeping the ref count takes
   space and time upfront, with benefits coming in or possibly not later,
   depending on what happens with it. Overall I believe it balances
   out, or is in favour of the counting, but /dropping/ the refcount
   always loses.
 - Qt containers are reference counted. This makes them less fragile
   then a "owning storage + views" combination distributed across
   different parts of the code. Non-gurus have a certain chance to
   fumble in this situation, and I prefer setups that make it hard
   for me to fumble.

Andre'

PS:

> This is NOIv1: https://www.youtube.com/watch?v=JUUID0_NvQI
> 
> We can also return spans:
> 
>     QSpan<QAction*> actions() const;
> 
> but this only works, though, if we're returning a) stored and b) 
> contiguous data. If we return computed or non-contiguous data, instead, 
> we can, however, use coroutines:
> 
>    std::g/QGenerator<QModelIndex> selectedIndexes() const;
> 
> This is NOIv2: https://www.youtube.com/watch?v=tvdwYwTyrig
> 
> The benefit here isn't only that the implementation data structure is 
> encapsulated, but also that, for computed values, the computation stops 
> when the consumer stops consuming.
> 
> The drawback is that the memory allocation of the coroutines' stack 
> frame is not elidable (crosses ABI boundaries), but when compared to 
> returning computed owning containers, you don't have more, and typically 
> have much less, memory allocations.

Seriously: In the normal course of normal activities I have a string,
and I want it to pass to a function. I have this "problem" often. Now I
do /not/ want to think about coroutines, elidable stack frames, ABI
boundaries and such all the time, I just want to call the function. Now.
And I simply do that. And I regularly can do that without bad effects on
the rest of my life. This is exactly the kind of convenience that Qt
provides, and that is not what I get with "pure" C++. That's also one of
the reasons I use Qt.

Now at some time I may notice that in some particular situation this was
not appropriate, because the result is slow. This actually happens
occasionally. Inner loops and such. But this is rare. Then I fix /this/
problem and I am happy. I am happy not because I had the problem, or
because I fixed it, but because this did not happen for a long time
before, and during all this long time I did not have to worry about
coroutines, elidable stack frames and ABI boundaries and I just could
call functions.