[Development] Rvalue pinning in Qt

Giuseppe D'Angelo giuseppe.dangelo at kdab.com
Mon Jun 20 16:46:57 CEST 2022


In the discussion for QTBUG-103940 it has been pointed out that we did 
not have a formal discussion regarding rvalue pinning in Qt facilities. 
We are actually inconsistent in its support, as we do use it in some 
places and do not use it elsewhere.

> https://bugreports.qt.io/browse/QTBUG-103940

== Rvalue what? ==

A fancy name for: "if a function/class is operating on a rvalue, should 
it store a copy of it in order to keep it alive?". Consider

>   auto tokenizer = QStringTokenizer(getString(), u"X");

If QStringTokenizer merely stores a reference to the input string, then 
the above would immediately dangle. But if it always stores a copy, then 
this code becomes unnecessarily expensive:

>   std::u16string someLongString = u"...";
>   auto tokenizer = QStringTokenizer(someLongString, u"X");

For this reason QStringTokenizer moves and stores the input if it's an 
rvalue, but only keeps a reference if it's an lvalue.

== The inconsistency ==

As far as I know, the only two places where we perform rvalue pinning 
are QStringTokenizer (6.0) and QMap/QHash::asKeyValueRange() (6.4). We 
don't do it elsewhere -- and we actually discourage users from relying 
on it. [1]

Case in point, the BR linked above, where one sees that 
QRegularExpression is NOT pinning:

>   QRegularExpression re;
>   QRegularExpressionMatch m;
>   if (getString().contains(re, &m))
>     use(m); // kaboom, `m` uses data of the destroyed temporary

== Should we do rvalue pinning in all cases? ==

While I was skeptical at first, I'm now more and more convinced that we 
should always pin rvalue Qt containers (incl. strings).

There is a name for this in C++: COW Qt containers are, in fact, _views_ 
[2][3][4]. QStringTokenizer, QRegularExpression etc. are also 
"conceptually" views (neither it is right now, technically speaking).

Now, when you pass a "source" view into another view, the latter takes a 
copy/move of the former. That's OK, because copying/moving views is 
cheap by contract. This is described by the `views::all` protocol [5][6] 
that views are supposed to use.

I'm therefore proposing to universally expand our pinning, following the 
  model of views.

For QStringTokenizer it wouldn't change much, as it already does rvalue 
pinning. If one extends it to the entirety of the views::all protocol, 
then it means that for instance this code would become well formed 
(currently it dangles):

>   auto getTokenizer() {
>      QString s = getString();
>      auto tok = QStringTokenizer(s, u"x");
>      return tok;
>   }
>   auto tok = getTokenizer();
>   for (const auto &token : tok) { ... }

But note that you can just `std::move(s)` into the tokenizer right now.

On the other hand it would also mean that

>   QString s = getString();
>   auto tok = QStringTokenizer(s, u"x");
would take a copy (given `s` is a view after [4]), while now it only 
takes a reference.

For QRegularExpression the situation is a tad more complex, because QRE 
is mostly out of line, and its API takes QString+QStringView. This means 
we cannot pin arbitrary rvalues string-like objects. Tony tables style:

> ╔════════════════════════════════════════╦══════════════════════════════════════════════════════════════════════════════════════╦═══════════════════════════╗
> ║                  Code                  ║                                         Now                                          ║           After           ║
> ╠════════════════════════════════════════╬══════════════════════════════════════════════════════════════════════════════════════╬═══════════════════════════╣
> ║ auto m1 = re.match(someQString);       ║ copies the string                                                                    ║ same                      ║
> ║ auto m2 = re.match(getQString());      ║ copies the string (no rvalue overload), but it's documented to be a "bad idea"in [1] ║ same, but fully supported ║
> ║ auto m3 = re.match(someQStringView);   ║ copies the string view                                                               ║ same                      ║
> ║ auto m4 = re.match(getQStringView());  ║ copies the string view                                                               ║ same                      ║
> ║ auto m5 = re.match(stdu16string);      ║ OK, through implicit conversion to QStringView                                       ║ same                      ║
> ║ auto m6 = re.match(getStdU16String()); ║ RUNTIME ERROR (dangles)                                                              ║ same                      ║
> ╚════════════════════════════════════════╩══════════════════════════════════════════════════════════════════════════════════════╩═══════════════════════════╝

We would however have the bug report scenario not dangle:

>   QRegularExpression re;
>   QRegularExpressionMatch m;
>   if (getString().contains(re, &m))
>     use(m); // ok, `m` is keeping a copy/move of the temporary alive

Implementation wise, it's pretty much straightforward given the code is 
already there. It's mostly going to be a documentation issue -- removing 
the notes that say that the above is unsupported.


[1] https://doc-snapshots.qt.io/qt6-dev/qregularexpression.html#match
[2] https://eel.is/c++draft/range.view#2
[3] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2415r2.html
[4] https://codereview.qt-project.org/c/qt/qtbase/+/407325
[5] https://eel.is/c++draft/range.all#general-2
[6] https://en.cppreference.com/w/cpp/ranges/all_view

Giuseppe D'Angelo | giuseppe.dangelo at kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4244 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://lists.qt-project.org/pipermail/development/attachments/20220620/1e47905a/attachment.bin>

More information about the Development mailing list