[Development] QString and related changes for Qt 6

Wed May 13 08:27:29 CEST 2020

> On 12 May 2020, at 23:21, Thiago Macieira <thiago.macieira at intel.com> wrote:
> 
> On Tuesday, 12 May 2020 08:42:28 PDT Matthew Woehlke wrote:
>> How will this work? As I understand, the main advantage to
>> QStringLiteral is that it statically encodes the *length* as well as the
>> data. This isn't possible with raw literals, which are merely
>> NUL-terminated.
> 
> Black magic!
> 
> I mean, templates and constexpr. QStringView has these two constructors:
> 
>    template <typename Char, size_t N>
>    Q_DECL_CONSTEXPR QStringView(const Char (&array)[N]) noexcept;
> 
>    template <typename Char>
>    Q_DECL_CONSTEXPR QStringView(const Char *str) noexcept;
> 
> The first one has a clear-cut size and can be initialised from a string 
> literal. The second one can attempt to determine at constexpr time what the 
> string length is.
> 
> It can't do so today (5.15) because of the lack of if constexpr. But Qt 6.0 
> will require C++17, so it can use if constexpr and implement a scan-for-NUL at 
> constexpr time if the payload is also constexpr. If it isn't, then it falls 
> back to calling qustrlen().
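
A minimal sketch of the scan-for-NUL idea, not Qt's actual code: lengthOf() and BasicView are hypothetical names, and the sketch relies on a plain constexpr loop (legal since C++14) rather than on if constexpr. When the argument is a constant expression the compiler can evaluate the loop itself; otherwise the same loop runs at run time, which is the role qustrlen() plays in the description above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: count characters up to the terminating NUL.
// Being constexpr, it can be evaluated by the compiler for constant input.
template <typename Char>
constexpr std::size_t lengthOf(const Char *str) noexcept
{
    std::size_t n = 0;
    while (str[n] != Char(0))
        ++n;
    return n;
}

// Hypothetical stand-in for a non-owning view built from a bare pointer.
template <typename Char>
struct BasicView {
    const Char *data;
    std::size_t size;
    constexpr BasicView(const Char *str) noexcept
        : data(str), size(lengthOf(str)) {}
};

// Constant input: both lengths are computed during compilation.
static_assert(lengthOf(u"foo") == 3, "length is known at compile time");
constexpr BasicView<char16_t> greeting(u"hello");
static_assert(greeting.size == 5, "the view's size is a constant too");

// Non-constant input: the same constructor scans at run time.
inline std::size_t runtimeLength(const char16_t *str)
{
    BasicView<char16_t> view(str);
    assert(view.data == str);
    return view.size;
}
```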
> 
>> Even std::string wants literals for this reason. A UDL would obviously
>> be superior, but I don't see us ever getting rid of some form of QString
>> literal short of templatizing *everything* that takes a T* (for T in
>> char, char16_t, etc.) to take a T(&)[N] instead.
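
For reference, a user-defined literal carries the length because the compiler passes it to the literal operator as a second argument, so nothing has to scan for the terminator. A minimal sketch using std::u16string_view as a stand-in; the _view suffix is a made-up name for illustration, not a Qt suffix.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical suffix: the compiler supplies both the pointer and the
// length of the literal, excluding the terminating NUL.
constexpr std::u16string_view operator""_view(const char16_t *str,
                                              std::size_t len) noexcept
{
    return std::u16string_view(str, len);
}

static_assert(u"foo"_view.size() == 3, "length comes from the compiler");

// An embedded NUL does not truncate the literal, which shows that the
// length is not obtained by scanning.
static_assert(u"a\0b"_view.size() == 3, "embedded NUL is counted");
```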
> 
> 	u"foo"_qs
> 	u"foo"_qsv;
> 
> But QStringView(u"foo") should call that first constructor. Doesn't it? I 
> never remember if the literal decays to pointer before the overload 
> resolution.
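
On the decay question: the literal does not decay before overload resolution, but the array-to-pointer conversion still ranks as an exact match, so the pointer overload stays a viable candidate next to the array one. One way to take the guesswork out is to constrain both constructors so that only one of them is viable for a given argument. The sketch below does that with enable_if; View, Source and scanForNul() are hypothetical names, and the array constructor assumes the array ends with a NUL, as a string literal does.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

enum class Source { Array, Pointer };

// Hypothetical helper: find the terminator of a NUL-terminated string.
constexpr std::size_t scanForNul(const char16_t *s) noexcept
{
    std::size_t n = 0;
    while (s[n] != 0)
        ++n;
    return n;
}

// Hypothetical stand-in for a view type with two constrained constructors.
struct View {
    const char16_t *data;
    std::size_t size;
    Source source;

    // Viable only for arrays: T is deduced as char16_t[N], so the size is
    // known statically (N - 1, assuming a trailing NUL).
    template <typename T, std::enable_if_t<std::is_array<T>::value, int> = 0>
    constexpr View(const T &array) noexcept
        : data(array), size(std::extent<T>::value - 1), source(Source::Array) {}

    // Viable only for pointers: the length has to be scanned for.
    template <typename T, std::enable_if_t<std::is_pointer<T>::value, int> = 0>
    constexpr View(const T &str) noexcept
        : data(str), size(scanForNul(str)), source(Source::Pointer) {}
};

static_assert(View(u"foo").source == Source::Array, "literal picks the array form");
static_assert(View(u"foo").size == 3, "size comes from the array extent");

inline void example()
{
    const char16_t *runtimePtr = u"hello";
    View fromPointer(runtimePtr);
    assert(fromPointer.source == Source::Pointer && fromPointer.size == 5);
}
```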
> 
>>> In most other places we should by default only use QString, unless
>>> there are very significant performance benefits to be had from using
>>> QStringView. This helps us keep an API that’s both easy to use and
>>> maintain. With the ideas above, you can still create a read-only
>>> string, so data copies can in many cases be avoided if required.
>> 
>> Really? How?
>> 
>> The "nice" thing about QStringView is that it does not have ownership;
>> you have to be careful about how long you hold onto it lest it turn into
>> a dangling pointer. You can't construct a QString from any old bag of
>> byt^Wcharacters because a QString is implicitly valid until it is destroyed.
> 
> That's the problem we've had with QStringLiteral and QString::fromRawData().
> 
> You *can* create it from read-only data and tell it never to try to modify. 
> The trick is guaranteeing that it remains valid until the last user finished 
> using it. Because of copy-on-write, that last user can be much later than the 
> statement that created the QString in the first place.
> 
> One way to ensure that guarantee is to never unload/free the memory block in 
> the first place. We already don't unload plugins for this and similar reasons.

I have partial patches (they still need some more work) that let us create a QString from read-only data. This is possible because QString in Qt 6 has a begin/end pointer in the class itself (not in the d-pointer).

So a read-only QString would contain a null d-pointer plus the pointer to data and size/end.
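
To make the layout concrete, here is a rough sketch of the idea, not the actual patches: MiniString is a hypothetical class in which a null d-pointer marks data the string does not own, and the data is copied only when somebody asks for write access. std::shared_ptr<std::u16string> stands in for the reference-counted header.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical model: d is null for read-only data; ptr/len live in the
// class itself rather than behind the d-pointer.
class MiniString {
public:
    // Wrap read-only data without copying; the caller guarantees that the
    // data outlives every copy of the returned string.
    static MiniString fromReadOnly(const char16_t *data, std::size_t size)
    {
        MiniString s;
        s.ptr = data;
        s.len = size;
        return s;
    }

    bool ownsData() const noexcept { return d != nullptr; }
    const char16_t *constData() const noexcept { return ptr; }
    std::size_t size() const noexcept { return len; }

    // Write access: copy first if the data is read-only or shared.
    char16_t *data()
    {
        if (!d || d.use_count() > 1) {
            d = std::make_shared<std::u16string>(ptr, len);
            ptr = d->data();
        }
        return d->data();
    }

private:
    std::shared_ptr<std::u16string> d; // null means "not owned, never modify"
    const char16_t *ptr = u"";
    std::size_t len = 0;
};

inline void example()
{
    static const char16_t text[] = u"hello";
    MiniString s = MiniString::fromReadOnly(text, 5);
    assert(!s.ownsData() && s.constData() == text); // no copy was made
}
```

Copies of a read-only MiniString just copy the pointer and size, so no reference count is touched until one of them detaches.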

To avoid problems with plugins, we have two options: either we continue not unloading them (the safe bet), or we disable those constructors when compiling plugin code and force a copy of the data in that case.
> 
> One thing Lars and I agree on is that those literals must be null-terminated, 
> unlike QStringView. Whether it's simply an API contract or whether we test/
> enforce remains to be seen. On the platforms where Qt runs, we can almost 
> always read past the end of the string to see if the terminator is there, even 
> if it means writing assembly code.

Ideally, we can check this at compile time in most cases. Qt 5's QString makes the same assumption without checking it (you could get a non-null-terminated string by using fromRawData()).
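
As an illustration of what such a compile-time check could look like when the input is an array of known size (a sketch only; isNullTerminated() is a hypothetical name):

```cpp
#include <cstddef>

// Hypothetical check: the last element of the array must be the terminator.
template <typename Char, std::size_t N>
constexpr bool isNullTerminated(const Char (&array)[N]) noexcept
{
    return array[N - 1] == Char(0);
}

// A string literal always carries its terminator.
static_assert(isNullTerminated(u"foo"), "literals are null-terminated");

// A hand-built array need not, and the check notices.
constexpr char16_t raw[] = { u'f', u'o', u'o' };
static_assert(!isNullTerminated(raw), "no terminator in this array");
```

This only covers arrays; data that arrives as a bare pointer and a length would still need a contract or a run-time check.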

Cheers,
Lars

> 
>> That said, I think I understand the reasoning here; make it up front
>> that the input is going to wind up in *a* QString. If the user's input
>> is *already* a QString, the function can make a shared copy rather than
>> constructing a brand new one. However, it would be nice for such
>> functions to offer r-value reference overloads for cases where a QString
>> needs to be created, or if the user is done with their copy. (Actually,
>> a possibly-owning reference wrapper could be useful here...)
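
A sketch of the overload pair being asked for, using std::string as a stand-in for QString and a hypothetical Label class. Note that std::string is not implicitly shared, so the const-reference overload makes a deep copy here, whereas with QString it would be a shared copy.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class that stores the string it is given.
class Label {
public:
    // The caller keeps its string: copy it.
    void setText(const std::string &text) { text_ = text; }

    // The caller passes a temporary or is done with its copy: take it over.
    void setText(std::string &&text) noexcept { text_ = std::move(text); }

    const std::string &text() const noexcept { return text_; }

private:
    std::string text_;
};

inline void example()
{
    Label label;
    std::string kept = "kept by the caller";
    label.setText(kept);                 // const-reference overload
    assert(kept == "kept by the caller");
    label.setText(std::string("temp"));  // r-value overload, no extra copy
    assert(label.text() == "temp");
}
```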
> 
> -- 
> Thiago Macieira - thiago.macieira (AT) intel.com
>  Software Architect - Intel System Software Products