[Development] Why can't QString use UTF-8 internally?

Matthew Woehlke mw_triad at users.sourceforge.net
Wed Feb 11 18:59:17 CET 2015


On 2015-02-11 12:00, Thiago Macieira wrote:
> On Wednesday 11 February 2015 11:49:49 Matthew Woehlke wrote:
>> I'm not going to claim this is the *best* answer, but at least one that
>> seems logical... length() should be the number of times one must hit
>> backspace starting from the end of the text to erase the entire text.
> 
> That will depend on the editor. Some may remove the full character with all 
> the combining characters, some others may not.

Yeah, I thought of that :-/. TBH I think these sorts of things should be
specified by Unicode (if they aren't already; I would rather hope they
are) rather than Qt trying to decide how to answer them.

>> Conversely, I'm sure there are times when you need to know the number of
>> codepoints (e.g. allocating memory to make a copy). Possibly length()
>> and size() should return different results. (Which is a mess, but...)
> 
> Uh... no, that's not a good idea.
> 
> If we were going to do something like that, we'd have to find a less confusing 
> name. Something like width().

Well... yes, for the sake of compatibility I'm inclined to agree.
Changing the meaning of one or both of these, or the fact that they are
presently synonyms, would confuse the heck out of people. That said...
Bo *did* specify "length()" when he wanted a method to return logical
characters and not codepoints. It may be that he's just out of luck
there...

(@Konstantin, yes I'm aware that logical glyphs != codepoints... that
was the whole point of Bo's original request, at least as I understood it.)
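
To make the distinction concrete, here is a rough sketch in plain C++
(std::u16string rather than QString, so it stands alone). The names
codeUnits, codePoints and logicalChars are made up for illustration, and
logicalChars is a deliberate oversimplification: it only folds the basic
combining diacritical marks (U+0300..U+036F) into the preceding
character, which is nowhere near real Unicode grapheme segmentation. It
also assumes well-formed UTF-16.

```cpp
#include <cstddef>
#include <string>

// Number of UTF-16 code units, i.e. the storage size of the string.
// This is the quantity QString::size() and QString::length() count.
std::size_t codeUnits(const std::u16string &s)
{
    return s.size();
}

// Number of code points: a surrogate pair counts once.
// Assumes well-formed UTF-16 (every low surrogate follows a high one).
std::size_t codePoints(const std::u16string &s)
{
    std::size_t n = 0;
    for (char16_t c : s) {
        const bool lowSurrogate = c >= 0xDC00 && c <= 0xDFFF;
        if (!lowSurrogate)
            ++n;
    }
    return n;
}

// Hypothetical "logical character" count (illustration only).
// Simplification: only U+0300..U+036F are treated as combining marks
// that attach to the previous character; a real implementation would
// need proper grapheme cluster rules.
std::size_t logicalChars(const std::u16string &s)
{
    std::size_t n = 0;
    for (char16_t c : s) {
        const bool lowSurrogate = c >= 0xDC00 && c <= 0xDFFF;
        const bool combining = c >= 0x0300 && c <= 0x036F;
        if (!lowSurrogate && !combining)
            ++n;
    }
    return n;
}
```

For the string u"e\u0301\U0001D11E" ('e', a combining acute accent, and
a character outside the BMP) these give 4 code units, 3 code points and
2 logical characters, so a single length() cannot answer all three
questions at once.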

-- 
Matthew



