[Development] char8_t summary?
Mutz, Marc
marc at kdab.com
Sun Jul 14 08:28:58 CEST 2019
On 2019-07-13 21:39, Volker Hilsheimer wrote:
>> On 13 Jul 2019, at 13:41, Thiago Macieira <thiago.macieira at intel.com>
>> wrote:
>> On Friday, 12 July 2019 17:37:59 -03 Matthew Woehlke wrote:
>>> That said, I took a look at startsWith, and... surprise! It is *already
>>> a template*. So at least in that case, it isn't obvious why adding more
>>> combinations would be so terribly onerous.
>>
>> Again, note how the template implicitly assumes things. A 3-character string
>> cannot be present at the beginning (startsWith), end (endsWith) or anywhere in
>> the middle (contains, indexOf, lastIndexOf) of a 2-character one, for example.
>>
>> But a 2- and 3-byte UTF-8 string can be the prefix of a 1-character UTF-16
>> string and a 4-byte UTF-8 string can be the prefix of a 2-codeunit UTF-16 (1
>> character). That means implementing UTF-8 functions requires different
>> algorithms in the first place. That means templates are not usually the
>> answer.
>>
>> I'm not saying impossible. You can, by writing sufficiently generic algorithms
>> that scan the strings in lockstep (you can scan UTF-8 backwards, after all).
>> But the reason you don't *want* to is that our Latin1 and UTF-16 algorithms
>> are optimised, often vectorised, for their purpose. We don't want to lose the
>> efficiency we've already got.
>>
>> And I'm not saying we shouldn't have UTF-8 algorithms or even a
>> QUtf8StringView or some such. It would have helped in CBOR, for example, see
>> QCborStreamWriter:
>> void appendTextString(const char *utf8, qsizetype len);
>>
>> This is one that should at least get the overload.
>>
>> --
>> Thiago Macieira - thiago.macieira (AT) intel.com
>> Software Architect - Intel System Software Products
>
>
> As I understood the template suggestion, it’s more about not having to
> add 64 different overloads (or several more string classes) to the Qt
> API, and less about unifying all implementations into a single set of
> algorithms.
[I'm replying to Volker, but this should be read as replying to
everyone, and 'you' should be read as the plural form]
Templates do bring a bonus for documentability, of course: one template
vs. 64 explicit overloads. I hasten to add two things, though. First,
the 64 is counting *this, so we're back to 16 for documentation
purposes, because no-one is proposing to remove the member functions and
only provide the free functions that back them. Second, it's harder to
document what a template accepts than it is to document 16 overloads,
now that we can have multiple \fn per qdoc comment block.
But that doesn't reduce the number of overloads. That template will be
instantiated 16 times (and more, as it's hard to ignore const/non-const
without forcing a copy, and even with a copy, the template function
doesn't do implicit conversions the way an ordinary function would).
Those instantiations are functions. Inline ones, hopefully, but
nonetheless functions. It will not help compile-times, and it will
degrade the error messages from the compiler, even if we (as we should)
constrain the template.
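[To illustrate the implicit-conversion point, here is a minimal sketch.
BasicView, Owner and CanCallGeneric are made-up stand-ins for this
example, not Qt classes. The ordinary function accepts an Owner through
its conversion operator; the function template does not, because
template argument deduction does not look through user-defined
conversions.]

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for illustration only; these are not Qt classes.
template <typename Char>
struct BasicView { const Char *data; std::size_t size; };

struct Owner {
    std::string s;
    // Implicit conversion to a view, as an owning string class might offer.
    operator BasicView<char>() const { return {s.data(), s.size()}; }
};

// Ordinary function: an Owner argument is converted at the call site.
inline std::size_t plainLength(BasicView<char> v) { return v.size; }

// Function template: Char has to be deduced from the argument's own type,
// and deduction does not consider user-defined conversions.
template <typename Char>
std::size_t genericLength(BasicView<Char> v) { return v.size; }

// Detects whether genericLength(T) would compile (C++11-style SFINAE).
template <typename T, typename = void>
struct CanCallGeneric : std::false_type {};
template <typename T>
struct CanCallGeneric<T, decltype(genericLength(std::declval<T>()), void())>
    : std::true_type {};

static_assert(CanCallGeneric<BasicView<char>>::value,
              "the exact view type deduces fine");
static_assert(!CanCallGeneric<Owner>::value,
              "the conversion operator is not considered");

inline void example() {
    Owner o{"abc"};
    assert(plainLength(o) == 3);   // works via the conversion operator
    // genericLength(o);           // would not compile: Char cannot be deduced
}
```

[Getting the template to accept Owner needs either another overload or
an unconstrained parameter plus an explicit conversion inside, which is
the kind of machinery discussed in the rest of this mail.]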
As an example of what all of this means, look at
https://codereview.qt-project.org/c/qt/qtbase/+/181620, which is doing
exactly that: making a former non-template function a template. Not even
Thiago is sure it won't break code, and while I'd like to stand in front
of you and claim that I designed it so that there _is_ no difference, in
practice I wouldn't bet that some obscure compiler (like MSVC or the
Integrity one) won't throw logs^Wtrunks in my way by the time I hit
submit. Or look at QStringView ctors. It's a bit harder than it needs to
be, because QStringView can't depend on QString in-size (because QString
depends on QStringView), but you're basically asking to make every string
class member function that takes another string a mixture of
QString::arg() as proposed in 181620 and current QStringView
construction.
Besides, as we all know, you can't partially-specialise function
templates, so if you write 'specialise' what you're saying is either
'overload' or 'add a template struct with static members, partially
specialise the struct' (iow: overloads).
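[A minimal sketch of the two options, with hypothetical tag types (Utf8,
Utf16) and function names invented for this example. Writing
"template <typename N> const char *f<Utf8, N>(Utf8, N)" is ill-formed,
so the choice is between a partially specialised helper struct and
overloaded function templates.]

```cpp
#include <cassert>
#include <string>

// Hypothetical encoding tags, used only for this illustration.
struct Utf8 {};
struct Utf16 {};

// Option 1: a class template with a static member, partially specialised.
template <typename Haystack, typename Needle>
struct EndsWithImpl {
    static const char *name() { return "generic"; }
};
template <typename Needle>
struct EndsWithImpl<Utf8, Needle> {   // any needle, UTF-8 haystack
    static const char *name() { return "utf8-haystack"; }
};
// The function template only forwards to the struct.
template <typename Haystack, typename Needle>
const char *viaStruct(Haystack, Needle) {
    return EndsWithImpl<Haystack, Needle>::name();
}

// Option 2: overloaded function templates. Partial ordering picks the
// more specialised overload when both are viable.
template <typename Haystack, typename Needle>
const char *viaOverload(Haystack, Needle) { return "generic"; }
template <typename Needle>
const char *viaOverload(Utf8, Needle) { return "utf8-haystack"; }

inline void example() {
    assert(std::string(viaStruct(Utf8{}, Utf16{})) == "utf8-haystack");
    assert(std::string(viaStruct(Utf16{}, Utf8{})) == "generic");
    assert(std::string(viaOverload(Utf8{}, Utf16{})) == "utf8-haystack");
    assert(std::string(viaOverload(Utf16{}, Utf8{})) == "generic");
}
```

[Either way, there is one piece of code per combination that needs its
own algorithm.]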
I hope this convinces everyone to finally close the lid on the box
labelled 'use templates and everything will be oh so easy'.
Will we (have to) use templates? Yes. Will it reduce the number of
overloads? Only if you want to inflict pain on your users.
If you're still not convinced, here's QStringView::endsWith() as a
template:
template <typename Suffix>
    requires std::is_convertible_v<Suffix, QStringView>
          // || std::is_convertible_v<Suffix, QUtf8StringView>
          // || std::is_convertible_v<Suffix, QLatin1StringView> ...
Q_ALWAYS_INLINE
bool endsWith(const Suffix &s) const {
    return QtPrivate::endsWith(*this,
                               QtPrivate::qStringLikeToStringView(s));
}
with a qStringLikeToStringView() similar to the one in 181620. This uses
C++20, and I'm sure it loses something over the current implementation.
Qt::CaseSensitivity comes to mind. To anyone speaking up in favour of
the box: Please write this in C++11 before you hit reply :)
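[For a rough idea of the shape such a C++11 version takes, here is a
sketch under heavy assumptions. U16View, L1View, Haystack, toView and
IsStringLike are invented stand-ins, not Qt classes or the helpers from
181620, and Qt::CaseSensitivity is left out entirely. The
requires-clause becomes std::enable_if, and the per-encoding backends
remain ordinary overloads.]

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-ins for illustration only; these are not Qt classes.
struct U16View { const char16_t *data; std::size_t size; };
struct L1View  { const char *data; std::size_t size; };

namespace detail {
// One backend per needle encoding: the overloads are still there.
inline bool endsWith(U16View h, U16View n) {
    if (n.size > h.size) return false;
    const char16_t *tail = h.data + (h.size - n.size);
    for (std::size_t i = 0; i < n.size; ++i)
        if (tail[i] != n.data[i]) return false;
    return true;
}
inline bool endsWith(U16View h, L1View n) {
    if (n.size > h.size) return false;
    const char16_t *tail = h.data + (h.size - n.size);
    for (std::size_t i = 0; i < n.size; ++i)
        if (tail[i] != char16_t(static_cast<unsigned char>(n.data[i])))
            return false;
    return true;
}

// Maps each accepted argument type to a view the backends understand.
inline U16View toView(U16View v) { return v; }
inline L1View toView(L1View v) { return v; }
inline U16View toView(const std::u16string &s) {
    return U16View{s.data(), s.size()};
}

// The constraint: only these three types are accepted.
template <typename T>
struct IsStringLike : std::integral_constant<bool,
        std::is_same<T, U16View>::value ||
        std::is_same<T, L1View>::value ||
        std::is_same<T, std::u16string>::value> {};
} // namespace detail

struct Haystack {
    U16View v;
    // C++11 replacement for the requires-clause: SFINAE via enable_if.
    template <typename Suffix,
              typename std::enable_if<detail::IsStringLike<Suffix>::value,
                                      bool>::type = true>
    bool endsWith(const Suffix &s) const {
        return detail::endsWith(v, detail::toView(s));
    }
};

inline void example() {
    std::u16string text = u"hello";
    Haystack h{U16View{text.data(), text.size()}};
    assert(h.endsWith(std::u16string(u"llo")));
    assert(h.endsWith(L1View{"llo", 3}));
    assert(!h.endsWith(L1View{"hex", 3}));
}
```

[Note that this only accepts the three exact types listed in
IsStringLike. Anything that merely converts to one of them is rejected,
where an ordinary overload set would have accepted it.]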
Thanks,
Marc