[Development] Rvalue pinning in Qt
Giuseppe D'Angelo
giuseppe.dangelo at kdab.com
Mon Jun 20 16:46:57 CEST 2022
Hello,
In the discussion for QTBUG-103940 it has been pointed out that we did
not have a formal discussion regarding rvalue pinning in Qt facilities.
We are actually inconsistent in its support, as we do use it in some
places and do not use it elsewhere.
> https://bugreports.qt.io/browse/QTBUG-103940
== Rvalue what? ==
A fancy name for: "if a function/class is operating on a rvalue, should
it store a copy of it in order to keep it alive?". Consider
> auto tokenizer = QStringTokenizer(getString(), u"X");
If QStringTokenizer merely stores a reference to the input string, then
the above would immediately dangle. But if it always stores a copy, then
this code becomes unnecessarily expensive:
> std::u16string someLongString = u"...";
> auto tokenizer = QStringTokenizer(someLongString, u"X");
For this reason QStringTokenizer moves and stores the input if it's an
rvalue, but only keeps a reference if it's an lvalue.
== The inconsistency ==
As far as I know, the only two places where we perform rvalue pinning
are QStringTokenizer (6.0) and QMap/QHash::asKeyValueRange() (6.4). We
don't do it elsewhere -- and we actually discourage users from relying
on it. [1]
Case in point, the BR linked above, where one sees that
QRegularExpression is NOT pinning:
> QRegularExpression re;
> QRegularExpressionMatch m;
> if (getString().contains(re, &m))
> use(m); // kaboom, `m` uses data of the destroyed temporary
== Should we do rvalue pinning in all cases? ==
While I was skeptical at first, I'm now more and more convinced that we
should always pin rvalue Qt containers (incl. strings).
There is a name for this in C++: COW Qt containers are, in fact, _views_
[2][3][4]. QStringTokenizer, QRegularExpression etc. are also
"conceptually" views (neither it is right now, technically speaking).
Now, when you pass a "source" view into another view, the latter takes a
copy/move of the former. That's OK, because copying/moving views is
cheap by contract. This is described by the `views::all` protocol [5][6]
that views are supposed to use.
I'm therefore proposing to universally expand our pinning, following the
model of views.
For QStringTokenizer it wouldn't change much, as it already does rvalue
pinning. If one extends it to the entirety of the views::all protocol,
then it means that for instance this code would become well formed
(currently it dangles):
> auto getTokenizer() {
> QString s = getString();
> auto tok = QStringTokenizer(s, u"x");
> return tok;
> }
>
> auto tok = getTokenizer();
> for (const auto &token : tok) { ... }
But note that you can just `std::move(s)` into the tokenizer right now.
On the other hand it would also mean that
> QString s = getString();
> auto tok = QStringTokenizer(s, u"x");
would take a copy (given `s` is a view after [4]), while now it only
takes a reference.
For QRegularExpression the situation is a tad more complex, because QRE
is mostly out of line, and its API takes QString+QStringView. This means
we cannot pin arbitrary rvalues string-like objects. Tony tables style:
> ╔════════════════════════════════════════╦══════════════════════════════════════════════════════════════════════════════════════╦═══════════════════════════╗
> ║ Code ║ Now ║ After ║
> ╠════════════════════════════════════════╬══════════════════════════════════════════════════════════════════════════════════════╬═══════════════════════════╣
> ║ auto m1 = re.match(someQString); ║ copies the string ║ same ║
> ║ auto m2 = re.match(getQString()); ║ copies the string (no rvalue overload), but it's documented to be a "bad idea"in [1] ║ same, but fully supported ║
> ║ auto m3 = re.match(someQStringView); ║ copies the string view ║ same ║
> ║ auto m4 = re.match(getQStringView()); ║ copies the string view ║ same ║
> ║ auto m5 = re.match(stdu16string); ║ OK, through implicit conversion to QStringView ║ same ║
> ║ auto m6 = re.match(getStdU16String()); ║ RUNTIME ERROR (dangles) ║ same ║
> ╚════════════════════════════════════════╩══════════════════════════════════════════════════════════════════════════════════════╩═══════════════════════════╝
We would however have the bug report scenario not dangle:
> QRegularExpression re;
> QRegularExpressionMatch m;
> if (getString().contains(re, &m))
> use(m); // ok, `m` is keeping a copy/move of the temporary alive
Implementation wise, it's pretty much straightforward given the code is
already there. It's mostly going to be a documentation issue -- removing
the notes that say that the above is unsupported.
Opinions?
[1] https://doc-snapshots.qt.io/qt6-dev/qregularexpression.html#match
[2] https://eel.is/c++draft/range.view#2
[3] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2415r2.html
[4] https://codereview.qt-project.org/c/qt/qtbase/+/407325
[5] https://eel.is/c++draft/range.all#general-2
[6] https://en.cppreference.com/w/cpp/ranges/all_view
--
Giuseppe D'Angelo | giuseppe.dangelo at kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4244 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://lists.qt-project.org/pipermail/development/attachments/20220620/1e47905a/attachment.bin>
More information about the Development
mailing list