[Development] char8_t summary?

Mutz, Marc marc at kdab.com
Sun Jul 14 08:28:58 CEST 2019


On 2019-07-13 21:39, Volker Hilsheimer wrote:
>> On 13 Jul 2019, at 13:41, Thiago Macieira <thiago.macieira at intel.com> wrote:
>> On Friday, 12 July 2019 17:37:59 -03 Matthew Woehlke wrote:
>>> That said, I took a look at startsWith, and... surprise! It is
>>> *already a template*. So at least in that case, it isn't obvious why
>>> adding more combinations would be so terribly onerous.
>> 
>> Again, note how the template implicitly assumes things. A 3-character
>> string cannot be present at the beginning (startsWith), end (endsWith)
>> or anywhere in the middle (contains, indexOf, lastIndexOf) of a
>> 2-character one, for example.
>>
>> But a 2- and 3-byte UTF-8 string can be the prefix of a 1-character
>> UTF-16 string and a 4-byte UTF-8 string can be the prefix of a
>> 2-codeunit UTF-16 (1 character). That means implementing UTF-8
>> functions requires different algorithms in the first place. That means
>> templates are not usually the answer.
>> 
>> I'm not saying impossible. You can, by writing sufficiently generic
>> algorithms that scan the strings in lockstep (you can scan UTF-8
>> backwards, after all). But the reason you don't *want* to is that our
>> Latin1 and UTF-16 algorithms are optimised, often vectorised, for
>> their purpose. We don't want to lose the efficiency we've already got.
>> 
>> And I'm not saying we shouldn't have UTF-8 algorithms or even a
>> QUtf8StringView or some such. It would have helped in CBOR, for
>> example, see QCborStreamWriter:
>>    void appendTextString(const char *utf8, qsizetype len);
>> 
>> This is one that should at least get the overload.
>> 
>> --
>> Thiago Macieira - thiago.macieira (AT) intel.com
>>  Software Architect - Intel System Software Products
> 
> 
> As I understood the template suggestion, it’s more about not having to
> add 64 different overloads (or several more string classes) to the Qt
> API, and less about unifying all implementations into a single set of
> algorithms.
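
To make Thiago's length argument concrete, here is a minimal sketch in
plain standard C++, with no Qt types. The names lengthAllowsPrefix(),
nextUtf8(), nextUtf16() and startsWithUtf8() are made up for
illustration, and the decoders assume well-formed input. It shows the
size-based early-out rejecting a UTF-8 needle that is in fact a prefix
of the UTF-16 haystack, and a lockstep scan that gets it right:

```cpp
#include <cstddef>
#include <string>

// The early-out a same-encoding startsWith() can use: a needle with more
// code units than the haystack can never match.
template <typename Haystack, typename Needle>
bool lengthAllowsPrefix(const Haystack &h, const Needle &n)
{
    return n.size() <= h.size();
}

// Decodes one code point from UTF-8 at s[i] and advances i.
// Assumption: input is well-formed; there is no error handling.
inline char32_t nextUtf8(const std::string &s, std::size_t &i)
{
    const unsigned char b0 = static_cast<unsigned char>(s[i++]);
    if (b0 < 0x80)
        return b0;
    const int extra = b0 >= 0xF0 ? 3 : b0 >= 0xE0 ? 2 : 1;
    char32_t cp = b0 & (0x3F >> extra);   // payload bits of the lead byte
    for (int k = 0; k < extra; ++k)
        cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
    return cp;
}

// Decodes one code point from UTF-16 at s[i] and advances i.
// Assumption: surrogates, if any, come in well-formed pairs.
inline char32_t nextUtf16(const std::u16string &s, std::size_t &i)
{
    const char32_t u = s[i++];
    if (u >= 0xD800 && u <= 0xDBFF && i < s.size())
        return 0x10000 + ((u - 0xD800) << 10) + (char32_t(s[i++]) - 0xDC00);
    return u;
}

// Lockstep scan: compare code points, not code units.
inline bool startsWithUtf8(const std::u16string &haystack,
                           const std::string &needle)
{
    std::size_t h = 0, n = 0;
    while (n < needle.size()) {
        if (h == haystack.size())
            return false;
        if (nextUtf16(haystack, h) != nextUtf8(needle, n))
            return false;
    }
    return true;
}
```

With U+20AC (three UTF-8 bytes, one UTF-16 code unit),
lengthAllowsPrefix() says "no" while startsWithUtf8() says "yes". This
is the slow, generic route; it says nothing about how a vectorised
implementation would look.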

[I'm replying to Volker, but this should be read as replying to 
everyone, and 'you' should be read as the plural form]

There's a bonus for documentability, of course, by using templates: one
template vs. 64 explicit overloads. I hasten to add two things, though.
First, the 64 is counting *this, so we're back to 16 for documentation
purposes, because no-one is proposing to remove the member functions and
only provide the free functions that back them. Second, it's harder to
document what a template accepts than it is to document 16 overloads,
now that we can have multiple \fn per qdoc comment block.

But that doesn't reduce the number of overloads. That template will be 
instantiated 16 times (and more, as it's hard to ignore const/non-const 
without forcing a copy, and even with a copy, the template function 
doesn't do implicit conversions the way an ordinary function would). 
Those instantiations are functions. Inline ones, hopefully, but 
nonetheless functions. It will not help compile-times, and it will 
degrade the error messages from the compiler, even if we (as we should) 
constrain the template.
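
A small illustration of the implicit-conversion and const/non-const
points, in plain C++11. View, plain(), deducedAsView() and
deducedConst() are hypothetical stand-ins, not Qt API:

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-in for a string-view-like class with an implicit ctor.
struct View {
    View(const char *s) : data(s) {}
    const char *data;
};

// Ordinary function: the argument is implicitly converted to View.
inline int plain(View) { return 1; }

// Template: T is deduced as the exact argument type. No conversion to View
// is considered, so every distinct argument type is its own instantiation.
template <typename T>
bool deducedAsView(const T &) { return std::is_same<T, View>::value; }

// Taking T& keeps constness in T: const and non-const arguments
// instantiate two different functions.
template <typename T>
bool deducedConst(T &) { return std::is_const<T>::value; }
```

plain("abc") converts the literal to a View, whereas
deducedAsView("abc") deduces T as char[4] and never sees a View unless
the function body performs the conversion itself.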

As an example of what all of this means, look at 
https://codereview.qt-project.org/c/qt/qtbase/+/181620, which is doing 
exactly that: make a former non-template a template function. Not even 
Thiago is sure it won't break code, and while I'd like to stand in front 
of you and claim that I designed it so that there _is_ no difference, in 
practice I wouldn't bet that some obscure compiler (like MSVC or the 
Integrity one) won't throw logs^Wtrunks in my way by the time I hit 
submit. Or look at QStringView ctors. It's a bit harder than it needs to 
be, because QStringView can't depend on QString in-size (because QString 
does on QStringView), but you're basically asking to make every string 
class member function that takes another string a mixture of 
QString::arg() as proposed in 181620 and current QStringView 
construction.

Besides, as we all know, you can't partially-specialise function 
templates, so if you write 'specialise' what you're saying is either 
'overload' or 'add a template struct with static members, partially 
specialise the struct' (iow: overloads).
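
For anyone who hasn't run into this, a sketch of the struct workaround
in plain C++11. The encoding tags, EndsWithImpl and endsWithKind() are
invented for the example; the returned integers merely show which
specialisation was picked:

```cpp
// Hypothetical encoding tags, for illustration only.
struct Utf8 {};
struct Utf16 {};
struct Latin1 {};

// Not valid C++: function templates cannot be partially specialised.
// template <typename Needle>
// int endsWithKind<Utf16, Needle>(Utf16, Needle);

// Class templates can be, so the work moves into a static member.
template <typename Haystack, typename Needle>
struct EndsWithImpl {
    static int kind() { return 0; }   // generic fallback
};

template <typename Needle>
struct EndsWithImpl<Utf16, Needle> {
    static int kind() { return 1; }   // UTF-16 haystack, any needle
};

template <>
struct EndsWithImpl<Utf16, Latin1> {
    static int kind() { return 2; }   // dedicated UTF-16/Latin-1 path
};

// The function template only forwards to the selected struct.
template <typename Haystack, typename Needle>
int endsWithKind(Haystack, Needle)
{
    return EndsWithImpl<Haystack, Needle>::kind();
}
```

Every combination that needs its own algorithm still gets its own body;
it has only moved from an overload into a specialisation.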

I hope this convinces everyone to finally close the lid on the box 
labelled 'use templates and everything will be oh so easy'.

Will we (have to) use templates? Yes. Will it reduce the number of 
overloads? Only if you want to inflict pain on your users.

If you're still not convinced, here's QStringView::endsWith() as a 
template:

    template <typename Prefix>
        requires std::is_convertible_v<Prefix, QStringView>
              // || ... QUtf8StringView || ... QLatin1StringView ...
    Q_ALWAYS_INLINE
    bool endsWith(const Prefix &p) const {
        return QtPrivate::endsWith(*this,
                                   QtPrivate::qStringLikeToStringView(p));
    }

with a qStringLikeToStringView() similar to the one in 181620. This uses 
C++20, and I'm sure it loses something over the current implementation. 
Qt::CaseSensitivity comes to mind. To anyone speaking up in favour of 
the box: Please write this in C++11 before you hit reply :)
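
To give an idea of the amount of machinery involved, here is a rough
C++11 sketch using std::enable_if instead of requires. It does not use
Qt: U16View, L1View, toView(), endsWithImpl(), IsStringLike and
Haystack are all hypothetical stand-ins, and like the C++20 version
above it drops Qt::CaseSensitivity:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for a UTF-16 view and a Latin-1 view.
struct U16View {
    const char16_t *p; std::size_t n;
    U16View(const std::u16string &s) : p(s.data()), n(s.size()) {}
};
struct L1View {
    const char *p; std::size_t n;
    L1View(const std::string &s) : p(s.data()), n(s.size()) {}
};

// Backend overloads, one per combination: the template does not remove them.
inline bool endsWithImpl(U16View h, U16View s)
{
    if (s.n > h.n)
        return false;
    for (std::size_t i = 0; i < s.n; ++i)
        if (h.p[h.n - s.n + i] != s.p[i])
            return false;
    return true;
}
inline bool endsWithImpl(U16View h, L1View s)
{
    if (s.n > h.n)
        return false;
    for (std::size_t i = 0; i < s.n; ++i)   // Latin-1 maps 1:1 to U+0000..U+00FF
        if (h.p[h.n - s.n + i] != char16_t(static_cast<unsigned char>(s.p[i])))
            return false;
    return true;
}

// Maps a string-like argument to its view type.
inline U16View toView(const std::u16string &s) { return U16View(s); }
inline U16View toView(U16View v) { return v; }
inline L1View toView(const std::string &s) { return L1View(s); }
inline L1View toView(L1View v) { return v; }

// C++11 replacement for the requires-clause.
template <typename T>
struct IsStringLike : std::integral_constant<bool,
    std::is_convertible<T, U16View>::value ||
    std::is_convertible<T, L1View>::value> {};

struct Haystack {
    std::u16string text;

    template <typename Suffix,
              typename std::enable_if<IsStringLike<Suffix>::value, int>::type = 0>
    bool endsWith(const Suffix &s) const
    {
        return endsWithImpl(toView(text), toView(s));
    }
};
```

Two view types and one haystack type already need two backends, four
toView() overloads and a trait; the real thing has more of each.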

Thanks,
Marc


