[Development] Why can't QString use UTF-8 internally?

Thiago Macieira thiago.macieira at intel.com
Wed Feb 11 17:35:20 CET 2015


On Wednesday 11 February 2015 14:23:08 Tomaz Canabrava wrote:
> On Wed, Feb 11, 2015 at 2:20 PM, Guido Seifert <wargand at gmx.de> wrote:
> > Minor OT, but I am too curious... do you have an example?
> > Are there really cases where turning lower case into upper case or
> > vice versa changes the length of a string?
> 
> Yes, and he already gave such an example: ß becomes SS

The other example that was given is 'i' (UTF-8 0x69) becoming 'İ' 
(UTF-8 0xc4 0xb0) under a Turkish locale.

Even if you use the new uppercase ẞ, the operation changes length in UTF-8 
(0xc3 0x9f → 0xe1 0xba 0x9e), even though it doesn't in UTF-16.

There are probably more examples.
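
For anyone who wants to check the byte counts above, here is a minimal 
sketch in Python (not Qt code). The helper name encoded_lengths and the 
CASE_PAIRS table are made up for illustration, and the case mappings are 
hard-coded from this thread rather than computed, since Python's 
str.upper() does not apply the Turkish locale rule.

```python
def encoded_lengths(text):
    """Return (UTF-8 byte count, UTF-16 code unit count) for text."""
    utf8_bytes = len(text.encode("utf-8"))
    # utf-16-le emits no BOM, so every code unit is exactly two bytes.
    utf16_units = len(text.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units


# (lowercase, uppercase) pairs taken from the thread. The mappings are
# hard-coded (an assumption of this sketch), not derived from a locale.
CASE_PAIRS = [
    ("\u00df", "SS"),      # sharp s -> SS
    ("i", "\u0130"),       # i -> dotted capital I (Turkish locale)
    ("\u00df", "\u1e9e"),  # sharp s -> capital sharp s
]

for lower, upper in CASE_PAIRS:
    # !a prints non-ASCII characters as escapes, so the output is
    # independent of the terminal encoding.
    print(f"{lower!a} {encoded_lengths(lower)} -> "
          f"{upper!a} {encoded_lengths(upper)}")
```

Note that ß → SS stays at two bytes in UTF-8 but grows from one UTF-16 
code unit to two, while the other two pairs grow only in UTF-8.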
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



