[Development] char8_t summary?

Thu Jul 11 21:01:27 CEST 2019

On Thursday, 11 July 2019 13:41:49 -03 Matthew Woehlke wrote:
> On 11/07/2019 05.05, Mutz, Marc via Development wrote:
> > There is a cost associated with another string class, too, and it's
> > combinatorial explosion. Even when we have all view types
> > (QLatin1StringView, QUtf8StringView, QStringView), consider the overload
> > set of QString::replace(), ignoring the (ptr, size) variants:
> > 
> >    {QL1V, QU8V, QSV, QChar} x {QL1V, QU8V, QSV, QChar}
> > 
> > that's 16 overloads. And that's without a possible QUtf32StringView.
> 
> So?
> 
> The right way to handle this is for those methods to be templated, in
> which case a) the code only needs to be written O(1) times, not O(N)
> times, and b) users can potentially specialize for their own string
> types as well.

Except that the whole point of those overloads is that each one can be more
efficient when the encoding is known, so templating won't help.
Templating won't make overload resolution any faster, but it will make
compilation slower. And if we want to take advantage of the fact that a string
is UTF-8, a single generic template won't work.

Right now, we know bytelength(latin1string) == codepointlength(utf16string),
so we know how to replace efficiently, and we apply that knowledge to indexOf,
startsWith, endsWith, etc. That's not the case for UTF-8, so the algorithms
will begin to differ very quickly.
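
To make the length argument concrete, here is a minimal sketch in plain
standard C++, not Qt code. The function names latin1ToUtf16,
utf16LengthOfUtf8 and lengthDemo are made up for illustration, and the UTF-8
counter assumes well-formed input. Latin-1 can size its output from the byte
count alone; UTF-8 has to scan the data first.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Latin-1 -> UTF-16: every byte maps to exactly one 16-bit code unit,
// so the output length is known before looking at the data.
inline std::u16string latin1ToUtf16(std::string_view latin1)
{
    std::u16string out(latin1.size(), u'\0');
    for (std::size_t i = 0; i < latin1.size(); ++i)
        out[i] = static_cast<unsigned char>(latin1[i]);
    return out;
}

// Number of UTF-16 code units needed for UTF-8 input.
// Assumption: the input is well-formed UTF-8 (no validation is done).
// The answer is not a function of the byte count, so we must scan.
inline std::size_t utf16LengthOfUtf8(std::string_view utf8)
{
    std::size_t units = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) == 0x80)
            continue;                    // continuation byte: adds nothing
        units += (c >= 0xF0) ? 2 : 1;    // 4-byte sequence -> surrogate pair
    }
    return units;
}

// Usage sketch: the same word, "café", in both encodings.
inline void lengthDemo()
{
    assert(latin1ToUtf16("caf\xE9").size() == 4);    // 4 bytes -> 4 units
    assert(utf16LengthOfUtf8("caf\xC3\xA9") == 4);   // 5 bytes -> 4 units
}
```

Anything that relies on "same index, same character" (replace, indexOf,
startsWith) can reuse the UTF-16 logic for Latin-1 input, but needs a
different algorithm once the needle is UTF-8.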

> If done cleverly, even the (pointer, size) variants should be able to
> wrap the arguments in a View, such that those method definitions are
> trivial.

A View is a (pointer, size) pair.
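
As a rough illustration, assuming a hypothetical Utf16View type and a
free-standing startsWith (neither is Qt API), the view is nothing more than
the two arguments bundled together, so the (pointer, size) overload is a
one-line forward:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical view type: just the (pointer, size) pair with a name.
struct Utf16View {
    const char16_t *data;
    std::size_t size;
};

// The actual algorithm lives in the view overload.
inline bool startsWith(Utf16View haystack, Utf16View needle)
{
    if (needle.size > haystack.size)
        return false;
    for (std::size_t i = 0; i < needle.size; ++i) {
        if (haystack.data[i] != needle.data[i])
            return false;
    }
    return true;
}

// The (pointer, size) overload only repackages its arguments.
inline bool startsWith(Utf16View haystack, const char16_t *needle, std::size_t size)
{
    assert(needle != nullptr || size == 0);
    return startsWith(haystack, Utf16View{needle, size});
}
```

For example, startsWith(Utf16View{u"char8_t", 7}, u"char", 4) goes through
the forwarding overload and compares the first four code units.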

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products