[Development] QString and related changes for Qt 6

Tue May 12 23:21:54 CEST 2020

On Tuesday, 12 May 2020 08:42:28 PDT Matthew Woehlke wrote:
> How will this work? As I understand, the main advantage to
> QStringLiteral is that it statically encodes the *length* as well as the
> data. This isn't possible with raw literals, which are merely
> NUL-terminated.

Black magic!

I mean, templates and constexpr. QStringView has these two constructors:

    template <typename Char, size_t N>
    Q_DECL_CONSTEXPR QStringView(const Char (&array)[N]) noexcept;

    template <typename Char>
    Q_DECL_CONSTEXPR QStringView(const Char *str) noexcept;

The first one has a clear-cut size, taken from the array type, and can be 
initialised from a string literal. The second one can attempt to determine 
the string length at constexpr time.

It can't do so today (5.15) because of the lack of if constexpr. But Qt 6.0 
will require C++17, so it can use if constexpr and implement a scan-for-NUL at 
constexpr time if the payload is also constexpr. If it isn't, then it falls 
back to calling qustrlen().
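
To make the scan-for-NUL idea concrete, here is a minimal sketch outside Qt. 
The names scanForNul and runtimeLength are hypothetical, not Qt API, and the 
sketch does not show how Qt itself would choose between the compile-time path 
and qustrlen(). It only shows that one constexpr function can serve both a 
constant payload and a run-time one.

```cpp
#include <cstddef>

// Hypothetical helper, not Qt API: count char16_t units up to the NUL.
constexpr std::size_t scanForNul(const char16_t *str) noexcept
{
    std::size_t n = 0;
    while (str[n] != u'\0')
        ++n;
    return n;
}

// Constant payload: the compiler evaluates the scan itself.
constexpr const char16_t *greeting = u"foo";
static_assert(scanForNul(greeting) == 3, "length computed at compile time");

// Non-constant payload: the same function runs as an ordinary loop at
// run time. This is the point where Qt would call qustrlen() instead.
std::size_t runtimeLength(const char16_t *str)
{
    return scanForNul(str);
}
```

The static_assert only compiles because the pointer is itself a constant 
expression; for a pointer received at run time, the same code is just a loop.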

> Even std::string wants literals for this reason. A UDL would obviously
> be superior, but I don't see us ever getting rid of some form of QString
> literal short of templatizing *everything* that takes a T* (for T in
> char, char16_t, etc.) to take a T(&)[N] instead.

    u"foo"_qs
    u"foo"_qsv

But QStringView(u"foo") should call that first constructor. Doesn't it? I 
never remember if the literal decays to pointer before the overload 
resolution.
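
Here is a sketch that avoids depending on the answer. Whether the array 
overload wins with the two signatures exactly as written depends on the 
partial-ordering rules, so this example constrains the pointer overload so 
that an array argument can never select it. MiniView, lengthOf and the _mv 
suffix are hypothetical names for illustration, not Qt API, and the 
constraint is a design choice of the sketch rather than a statement about 
Qt's header.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: scan for the terminating NUL.
constexpr std::size_t lengthOf(const char16_t *s) noexcept
{
    std::size_t n = 0;
    while (s[n] != u'\0')
        ++n;
    return n;
}

// Hypothetical stand-in for a UTF-16 string view; not Qt API.
struct MiniView
{
    const char16_t *data;
    std::size_t size;
    bool fromArray; // records which constructor was selected

    // Array overload: N includes the terminating NUL, so the length is N - 1.
    template <std::size_t N>
    constexpr MiniView(const char16_t (&array)[N]) noexcept
        : data(array), size(N - 1), fromArray(true) {}

    // Pointer overload. It takes "const Pointer &" and accepts only genuine
    // pointer types, so an array argument is rejected during deduction.
    template <typename Pointer,
              typename std::enable_if<
                  std::is_same<Pointer, const char16_t *>::value
                      || std::is_same<Pointer, char16_t *>::value,
                  bool>::type = true>
    constexpr MiniView(const Pointer &str) noexcept
        : data(str), size(lengthOf(str)), fromArray(false) {}

    // Length handed over directly, the way a literal operator receives it.
    constexpr MiniView(const char16_t *str, std::size_t len) noexcept
        : data(str), size(len), fromArray(false) {}
};

// A user-defined literal gets the length from the compiler, no scan needed.
constexpr MiniView operator""_mv(const char16_t *str, std::size_t len) noexcept
{
    return MiniView(str, len);
}

constexpr MiniView viaArray(u"foo");
static_assert(viaArray.fromArray && viaArray.size == 3, "array overload");

constexpr const char16_t *ptr = u"foo";
constexpr MiniView viaPointer(ptr);
static_assert(!viaPointer.fromArray && viaPointer.size == 3, "pointer overload");

constexpr MiniView viaLiteral = u"foo"_mv;
static_assert(viaLiteral.size == 3, "length supplied by the literal operator");
```

With the constraint in place the literal never decays: the array overload is 
the only viable candidate for u"foo", and the pointer overload is the only 
viable one for ptr.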

> > In most other places we should by default only use QString, unless
> > there are very significant performance benefits to be had from using
> > QStringView. This helps us keep an API that’s both easy to use and
> > maintain. With the ideas above, you can still create a read-only
> > string, so data copies can in many cases be avoided if required.
> 
> Really? How?
> 
> The "nice" thing about QStringView is that it does not have ownership;
> you have to be careful about how long you hold onto it lest it turn into
> a dangling pointer. You can't construct a QString from any old bag of
> byt^Wcharacters because a QString is implicitly valid until it is destroyed.

That's the problem we've had with QStringLiteral and QString::fromRawData().

You *can* create it from read-only data and tell it never to try to modify 
that data. The trick is guaranteeing that the data remains valid until the 
last user has finished using it. Because of copy-on-write, that last user can 
be much later than the statement that created the QString in the first place.

One way to ensure that guarantee is to never unload/free the memory block in 
the first place. We already don't unload plugins for this and similar reasons.
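
The lifetime issue can be sketched without Qt. BorrowingString below is a 
hypothetical class, not QString, and its fromRawData() only mimics the idea 
of wrapping read-only storage. The point it illustrates is that a copy made 
before any write keeps pointing at the original block, so the block has to 
outlive every copy. The sketch satisfies that by giving the storage static 
lifetime.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class for illustration only; this is not QString.
class BorrowingString
{
public:
    // Wrap caller-owned, read-only storage without copying it.
    static BorrowingString fromRawData(const char16_t *data, std::size_t size)
    {
        BorrowingString s;
        s.m_data = data;
        s.m_size = size;
        return s;
    }

    const char16_t *constData() const noexcept { return m_data; }
    std::size_t size() const noexcept { return m_size; }
    bool ownsData() const noexcept { return m_owned != nullptr; }

    // Writes never touch the borrowed block: copy first, then modify.
    void append(char16_t ch)
    {
        auto copy = std::make_shared<std::u16string>(m_data, m_data + m_size);
        copy->push_back(ch);
        m_data = copy->data();
        m_size = copy->size();
        m_owned = std::move(copy);
    }

private:
    const char16_t *m_data = nullptr;
    std::size_t m_size = 0;
    std::shared_ptr<std::u16string> m_owned; // empty while borrowing
};

// Storage with static lifetime: it outlives every copy made below.
const char16_t kStorage[] = u"foo";

void borrowThenDetach()
{
    BorrowingString a = BorrowingString::fromRawData(kStorage, 3);
    BorrowingString b = a;              // cheap copy, still points at kStorage
    assert(b.constData() == kStorage);

    b.append(u'!');                     // b detaches into its own buffer
    assert(b.ownsData() && b.size() == 4);
    assert(a.constData() == kStorage);  // a keeps borrowing the raw block
}
```

If kStorage lived in a plugin that got unloaded, a would be left dangling 
even though the code that created it finished long ago.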

One thing Lars and I agree on is that those literals must be null-terminated, 
unlike QStringView. Whether it's simply an API contract or whether we 
test/enforce it remains to be seen. On the platforms where Qt runs, we can 
almost always read past the end of the string to see if the terminator is 
there, even if it means writing assembly code.

> That said, I think I understand the reasoning here; make it up front
> that the input is going to wind up in *a* QString. If the user's input
> is *already* a QString, the function can make a shared copy rather than
> constructing a brand new one. However, it would be nice for such
> functions to offer r-value reference overloads for cases where a QString
> needs to be created, or if the user is done with their copy. (Actually,
> a possibly-owning reference wrapper could be useful here...)

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products