[Development] QUtf8String{, View}

Thiago Macieira thiago.macieira at intel.com
Sat May 16 18:16:40 CEST 2020


On Saturday, 16 May 2020 08:52:19 PDT Arnaud Clère wrote:
> Regarding the relevance of a QUtf8String, I feel like it would not be so
> useful unless it allows viewing its content as QChar instead of char (or
> char8_t), since handling multibyte characters is so error-prone. At least a
> QChar handles most Unicode characters as single entities...

A QUtf8StringIterator could easily be added to extract Unicode codepoints from
a UTF-8 string, just as QStringIterator already does for UTF-16.
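
For illustration, here is a minimal sketch of what such an iterator would have
to do, written in plain standard C++ rather than against Qt. QUtf8StringIterator
does not exist; nextCodePoint() and codePoints() are hypothetical names, and
mapping malformed input to U+FFFD is an assumption, not a statement about how Qt
would handle it.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper, not a Qt API: decodes the UTF-8 sequence that starts
// at s[pos] (pos must be < s.size()), advances pos, and returns the
// codepoint. Malformed input yields U+FFFD.
char32_t nextCodePoint(std::string_view s, std::size_t &pos)
{
    constexpr char32_t replacement = 0xFFFD;
    const auto lead = static_cast<unsigned char>(s[pos++]);
    if (lead < 0x80)
        return lead; // ASCII: one byte, one codepoint

    int trailing;  // number of continuation bytes expected
    char32_t cp;   // payload bits collected so far
    char32_t min;  // smallest value that needs this sequence length
    if ((lead & 0xE0) == 0xC0) {
        trailing = 1; cp = lead & 0x1F; min = 0x80;
    } else if ((lead & 0xF0) == 0xE0) {
        trailing = 2; cp = lead & 0x0F; min = 0x800;
    } else if ((lead & 0xF8) == 0xF0) {
        trailing = 3; cp = lead & 0x07; min = 0x10000;
    } else {
        return replacement; // stray continuation byte or invalid lead byte
    }

    const std::size_t afterLead = pos;
    for (int i = 0; i < trailing; ++i) {
        if (pos >= s.size()
            || (static_cast<unsigned char>(s[pos]) & 0xC0) != 0x80) {
            pos = afterLead; // truncated sequence: resume after the lead byte
            return replacement;
        }
        cp = (cp << 6) | (static_cast<unsigned char>(s[pos++]) & 0x3F);
    }

    // Reject overlong encodings, UTF-16 surrogates and out-of-range values.
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return replacement;
    return cp;
}

// Convenience wrapper: decodes the whole string into a list of codepoints.
std::vector<char32_t> codePoints(std::string_view s)
{
    std::vector<char32_t> out;
    for (std::size_t pos = 0; pos < s.size(); )
        out.push_back(nextCodePoint(s, pos));
    return out;
}
```

With this sketch, codePoints("a\xC3\xA9") would yield U+0061 followed by U+00E9.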

Usually, though, needing codepoints means you're doing something wrong.
Grapheme clusters can span multiple codepoints. Unless you're doing text
shaping, you probably don't need grapheme clusters. And if you don't need
those, why do you care about codepoints in the first place?
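
A small sketch of why counting codepoints does not give you "characters" either,
again in plain standard C++. countCodePoints() is a hypothetical helper that
assumes well-formed UTF-8, and the two constants are just sample data.

```cpp
#include <cstddef>
#include <string_view>

// Counts codepoints in well-formed UTF-8 by counting every byte that is not
// a continuation byte (bit pattern 10xxxxxx).
std::size_t countCodePoints(std::string_view utf8)
{
    std::size_t n = 0;
    for (unsigned char byte : utf8)
        n += (byte & 0xC0) != 0x80;
    return n;
}

// "a" followed by U+0301 COMBINING ACUTE ACCENT (UTF-8: 0xCC 0x81).
constexpr std::string_view decomposed = "a\xCC\x81";
// The same user-perceived character as precomposed U+00E1 (UTF-8: 0xC3 0xA1).
constexpr std::string_view precomposed = "\xC3\xA1";
```

Both strings display as one accented letter, yet the first is 3 bytes and 2
codepoints while the second is 2 bytes and 1 codepoint. Neither count tells
you how many grapheme clusters there are.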

That opens a philosophical question. In:

    QString s = QString::fromUtf16(u"a a\u0301"); // U+0301 COMBINING ACUTE ACCENT
    s.replace(u'a', u'b');

Should we now have a b with accent? (b́)
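
To make the two possible answers concrete, here is a sketch using
std::u16string as a stand-in for QString's UTF-16 storage. Both function names
are hypothetical, and isCombiningMark() deliberately covers only the Combining
Diacritical Marks block (U+0300..U+036F); real grapheme segmentation follows
UAX #29 and is considerably more involved.

```cpp
#include <cstddef>
#include <string>

// Simplifying assumption: only U+0300..U+036F count as combining marks.
bool isCombiningMark(char16_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

// Per-code-unit replacement: every matching code unit is replaced, whether
// or not a combining mark follows it.
std::u16string replaceCodeUnits(std::u16string s, char16_t before, char16_t after)
{
    for (char16_t &c : s) {
        if (c == before)
            c = after;
    }
    return s;
}

// Hypothetical alternative: replace 'before' only where it stands alone,
// i.e. where it is not the base of a following combining mark.
std::u16string replaceStandalone(std::u16string s, char16_t before, char16_t after)
{
    for (std::size_t i = 0; i < s.size(); ++i) {
        const bool hasMark = i + 1 < s.size() && isCombiningMark(s[i + 1]);
        if (s[i] == before && !hasMark)
            s[i] = after;
    }
    return s;
}
```

For the input u"a a\u0301", the first function gives u"b b\u0301" (the accent
now sits on the b) and the second gives u"b a\u0301" (the accented a is left
alone). Which of the two is "right" is exactly the open question.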

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products




