[Development] Why can't QString use UTF-8 internally?

Bo Thorsen bo at vikingsoft.eu
Wed Feb 11 11:06:56 CET 2015


On 11-02-2015 at 10:48, Olivier Goffart wrote:
> On Wednesday 11 February 2015 10:32:31 Bo Thorsen wrote:
>> This would make me very unhappy. I'm doing a customer project right now
>> that uses std::string all over the place and there is real pain involved
>> in this. It's an almost empty layer over char* and brings none of the
>> features of QString. Of all the failures of the C++ standards committee,
>> std::string is the worst.
>>
>> Any string class has to be Unicode. What it uses internally is an
>> implementation detail (which is what started this thread). It's fine to
>> have a pure ASCII string type as well, but there are so few cases left
>> in real world applications where this is useful.
>>
>> What QString internally uses is a pure optimization question, and I'll
>> leave that to others. But whatever is decided, I want to be sure it
>> keeps some of the things QString offers:
>>
>> 1) Unicode! Don't assume the user remembers to use utf8.
>> qlabel->setText(stdString) *will* fail. Leaving decisions on encoding to
>> users is a bad idea.
>>
>> 2) length() returns the number of chars I see on the screen, not a
>> random implementation detail of the chosen encoding.
>>
>> 3) at(int) and [] give the Unicode char, not a random encoding char.
>>
>> std::string fails at those completely basic requirements, which is why
>> you will never see me use it, unless some customer API demands it or I'm
>> in one of those exceptional cases where there is sure to be ascii only
>> in the strings.
>>
>> Another note: Latin1 is the worst idea for i18n ever invented, and it's
>> by now useless, irrelevant and only a source for bugs once you start to
>> truly support i18n outside of USA and Western Europe. I would be one
>> step closer to total happiness if C++17 and Qt7 make this "encoding"
>> completely unsupported.
>>
>> I know I've made the statements here a bit harsh, but I see the
>> same kinds of problems again and again in customer code, when they choose
>> to use std::string all over the place. They give the same arguments I've
>> seen here - optimized, faster, etc - and add a few like "easier to
>> switch away from Qt, backend is std/boost only and no Qt allowed and so
>> on". And they pay for it in development time, bugfixing and angry users.
>>
>> Sure, QString isn't optimized for some cases. But I'll take a less
>> optimized class any day over something that brings heaps of bugs. Then I
>> have time to focus on optimizing the serious things instead of fixing bugs.
>
> Again, std::string is not an equivalent of QString, it is an equivalent of
> QByteArray.
> The equivalent of QString would be std::wstring or std::u16string. And those
> classes have the properties you desire.

I know. I responded to a suggestion to use std::string with utf8 encoding.
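
To make the difference concrete, here is a minimal sketch in plain
standard C++ (no Qt), comparing a UTF-8 std::string with the
std::u16string that Olivier mentions. The function name
encoding_lengths_demo is made up for this illustration, and the é is
spelled out as its two UTF-8 bytes so the result does not depend on the
source file encoding. Treat it as an illustration of points 2 and 3,
not as a statement about how QString is implemented.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration: compares how a UTF-8 std::string
// and a UTF-16 std::u16string report length and indexed elements.
void encoding_lengths_demo() {
    // "café": the é (U+00E9) is written as its UTF-8 bytes 0xC3 0xA9.
    const std::string utf8 = "caf\xC3\xA9";
    const std::u16string utf16 = u"caf\u00E9";

    // Four characters on screen, but std::string counts five bytes.
    assert(utf8.size() == 5);
    // std::u16string counts four code units for the same text.
    assert(utf16.size() == 4);

    // Indexing std::string yields one byte of the encoding ...
    assert(static_cast<unsigned char>(utf8[3]) == 0xC3);
    // ... while indexing std::u16string yields the character itself here.
    assert(utf16[3] == u'\u00E9');

    // Caveat: a character outside the Basic Multilingual Plane (U+1F600)
    // takes a surrogate pair, so even UTF-16 reports two code units.
    const std::u16string emoji = u"\U0001F600";
    assert(emoji.size() == 2);
}
```

Calling encoding_lengths_demo() runs without tripping any assertion.
The last check shows that a 16-bit string type counts code units too,
so "what I see on the screen" only holds as long as the text stays
within the Basic Multilingual Plane and uses no combining characters.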

Bo Thorsen,
Director, Viking Software.

-- 
Viking Software
Qt and C++ developers for hire
http://www.vikingsoft.eu
