[Development] QAnyStringView

Thiago Macieira thiago.macieira at intel.com
Wed Jun 24 02:36:24 CEST 2020


On Tuesday, 23 June 2020 02:35:05 PDT Marc Mutz via Development wrote:
> I have come to believe that QUtf8StringView without QAnyStringView won't
> fly: Introducing QUtf8StringView without QAnyStringView will explode the
> number of mixed-type operations we need to support.

Hello Marc

Thank you for posting this and starting the discussion.

Question: what are the "mixed-type operations we need to support"? Where do 
you see the need for this?

> The best we can do to condense this down is
> to revoke string-ness of QByteArray and we'd be left with
> 
> - QStringView
> - QLatin1String
> - QUtf8StringView
> - QChar

Aside from places where an exception is worth it, our string API should:
- take QString by const-ref
- return QString by value

That condenses our four types to one for almost the entirety of Qt.

For new API that benefits from the exception, I'd reduce to two:
- QStringView
- QUtf8StringView

But the fact that you listed QChar in the first place indicates that you're 
talking about the string classes themselves. Nothing else uses QChar in our 
API. In that case, yes, QLatin1String and QChar are part of the overload set.

I'm going to restrict my answer from this point forward to the string classes 
themselves. For everything else, there's no apparent need for QAnyStringView. 
Whether it could benefit from QAnyStringView if it exists is a different 
story.

> the latter would have to accept plain char again, something we
> ASCII_DEPRECATED years ago, but should be re-considered under the new
> src-is-UTF-8 paradigm.

Agreed. SG16 is envious of us.

> Assuming for the sake of argument that we need those four types,
> consider QString::replace(). Experience shows that stuff like
> QStringBuilder expressions being passed will require an actual QString
> overload to be present, too. Ignoring existing overloads and regexp,
> we'd need 5x5=25 overloads. I won't enumerate them here. What I will
> enumerate is the complete set of overloads when using QAnyStringView:
> 
>     QString& QString::replace(QAnyStringView, QAnyStringView,
>                               Qt::CaseSensitivity);
> 
> That's it.
> 
> Unlike QStringView, QAnyStringView is a pure interface type. I won't add
> much in the way of parsing API to it, even though I acknowledge that's a
> slippery slope. While it would be easy to add trimmed(), and tokenize()
> would be really interesting, QAnyStringView should not be used for
> parsing. That's what we have the three non-variant string view types
> for. Being a pure interface type means we can add more "dangerous"
> conversions. QStringView can't be constructed from a QStringBuilder,
> e.g., because it's almost impossible to make that work without
> referencing destroyed data:
> 
>     QStringView s = u'c' + QString::number(x); // oops
>     QString c = u'c' + QString::number(x);
>     QStringView s = c; // ok
> 
> But QAnyStringView supports this:
> 
>     str.replace(name, name % "_1");

That's not the same code. In one you're creating a view object and accessing 
it later outside of the same statement; in the other, it is created and 
accessed in the same statement. That is to say, the following works:

  void foo(QStringView str);
  foo(u'c' + QString::number(x));

and the following doesn't:

  QAnyStringView s = u'c' + QString::number(x);
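
To make the lifetime distinction concrete without depending on Qt, here is a 
minimal sketch that uses std::string_view as a stand-in for the Qt view types 
and std::string concatenation as a stand-in for the QStringBuilder expression. 
The names countChars and makeLabel are hypothetical and exist only for this 
illustration; the rule it shows (a temporary lives until the end of the full 
expression) is plain C++.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical sink function: it only reads the view during the call.
std::size_t countChars(std::string_view str)
{
    return str.size();
}

// Stand-in for u'c' + QString::number(x): returns a temporary string.
std::string makeLabel(int x)
{
    return "c" + std::to_string(x);
}

// Fine: the temporary std::string lives until the end of the full
// expression, so the view stays valid for the whole call.
std::size_t viewUsedInSameStatement(int x)
{
    return countChars(makeLabel(x));
}

// Also fine: the named string outlives the view that refers to it.
std::size_t viewOfNamedString(int x)
{
    const std::string c = makeLabel(x);
    const std::string_view s = c;
    return countChars(s);
}

// Not fine (left commented out on purpose): the temporary is destroyed
// at the end of the declaration, so 's' would dangle afterwards.
//   std::string_view s = makeLabel(x);
//   countChars(s); // would read destroyed data
```

The same reasoning applies to any view type, which is why the function-call 
form is safe regardless of which view class the parameter uses.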

> QAnyStringView solves this in the sense that one overload can replace
> many overloads. The complexity is still there, a binary visitation of a
> QAnyStringView produces nine instantiations of the visitor (though that
> can be reduced to six in many cases), but many implementations fall into
> one of just two classes: 1) a function would just call toString() on the
> any-string-view, anyway, in which case the QString construction is taken
> out of user code and centralized in the library. If you think that
> doesn't matter, look at the tst_qstatemachine numbers in
> 
>    https://codereview.qt-project.org/c/qt/qtbase/+/301595 (-10KiB just
> from temporary QString creation and destruction)

I'm leaning towards agreeing to use QAnyStringView in the string classes.

I'll remove my -2.
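
For reference, the "nine instantiations" figure quoted above can be reproduced 
with a small standard-C++ sketch. It assumes a variant over three view types; 
Latin1View, Utf8View, Utf16View, AnyView, PairNamer and the helper functions 
are hypothetical stand-ins, not the actual QAnyStringView design.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <string_view>
#include <variant>

// Hypothetical stand-ins for the three non-variant view types.
struct Latin1View { std::string_view data; };
struct Utf8View   { std::string_view data; };
struct Utf16View  { std::u16string_view data; };

// Hypothetical stand-in for an any-string-view: one of the three.
using AnyView = std::variant<Latin1View, Utf8View, Utf16View>;

inline const char *tag(Latin1View) { return "L1"; }
inline const char *tag(Utf8View)   { return "UTF-8"; }
inline const char *tag(Utf16View)  { return "UTF-16"; }

// Binary visitor: the call operator is instantiated once per
// (Lhs, Rhs) combination that std::visit may dispatch to.
struct PairNamer
{
    template <typename Lhs, typename Rhs>
    std::string operator()(Lhs lhs, Rhs rhs) const
    {
        return std::string(tag(lhs)) + "/" + tag(rhs);
    }
};

std::string visitPair(const AnyView &a, const AnyView &b)
{
    return std::visit(PairNamer{}, a, b);
}

// Walks every pairing of the three alternatives and counts the
// distinct visitor instantiations that were reached.
std::size_t countCombinations()
{
    const AnyView views[] = { Latin1View{"a"}, Utf8View{"a"}, Utf16View{u"a"} };
    std::set<std::string> seen;
    for (const AnyView &a : views)
        for (const AnyView &b : views)
            seen.insert(visitPair(a, b));
    return seen.size();
}
```

With three alternatives on each side, countCombinations() reaches 3 x 3 = 9 
distinct combinations, which is the complexity that moves from the overload 
set into the visitor.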

> 2) the complexity is already there and QAnyStringView helps in reducing
> it:
> 
>    https://codereview.qt-project.org/c/qt/qtbase/+/303483 (QCalendar)
>    https://codereview.qt-project.org/c/qt/qtbase/+/303512 (QColor)
>    https://codereview.qt-project.org/c/qt/qtbase/+/303707 (arg())
>    https://codereview.qt-project.org/c/qt/qtbase/+/303708 (QUuid)

Agreed on arg(), it's a great clean-up and performance improvement.

But it's part of QString itself. The other ones, however, are the slippery 
slope. I agree they improve performance for sink-only functions, but we don't 
*need* QAnyStringView for them. For example, for QCalendar, they could be the 
QStringView/QUtf8StringView pair.
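
As a rough sketch of what such a pair could look like for a sink-only 
function, the following uses std::u16string_view and std::string_view as 
stand-ins for QStringView and QUtf8StringView. The function isHexColor and 
its helper are hypothetical and are not taken from QCalendar or QColor; the 
point is only that two thin overloads can share one templated implementation.

```cpp
#include <cstddef>
#include <string_view>

namespace sketch {

// Shared implementation, written once over the code-unit type.
// Accepts exactly "#rrggbb" with hexadecimal digits.
template <typename Char>
bool isHexColorImpl(std::basic_string_view<Char> name)
{
    if (name.size() != 7 || name[0] != Char('#'))
        return false;
    for (std::size_t i = 1; i < name.size(); ++i) {
        const Char c = name[i];
        const bool digit = c >= Char('0') && c <= Char('9');
        const bool lower = c >= Char('a') && c <= Char('f');
        const bool upper = c >= Char('A') && c <= Char('F');
        if (!digit && !lower && !upper)
            return false;
    }
    return true;
}

// The public overload pair: one UTF-16 view, one 8-bit view.
bool isHexColor(std::u16string_view name) { return isHexColorImpl(name); }
bool isHexColor(std::string_view name)    { return isHexColorImpl(name); }

} // namespace sketch
```

Callers can then pass either u"#ff8800" or "#ff8800" without constructing an 
owning string first, at the cost of two declarations instead of one.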

My problem is not with the cleanup that it provides; it's with adding yet 
another class to our API.

> Now that I hopefully have convinced you that we need QAnyStringView,
> where to go from here?
> 
> Given the lack of time until Qt 6.0, I'd like to propose to just replace
> all overload sets that contain QL1S with one overload taking
> QAnyStringView

Agreed for the string classes themselves.

> The implementation usually contains the optimized handling of L1 data
> already, and can often be easily extended to UTF-8, too, cf. QColor,
> QUuid, arg().

Those are likely candidates, yes.

I just don't want to give blanket approval for everything. There may be places 
where the correct solution is to delete the QLatin1String overload and keep 
only QString.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products




