[Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

Thu Oct 15 15:44:31 CEST 2015

On October 15, 2015 14:53:29 Konstantin Ritt <ritt.ks at gmail.com> wrote:

> 2015-10-15 11:00 GMT+03:00 Bubke Marco <Marco.Bubke at theqtcompany.com>:
> On October 15, 2015 08:45:30 Knoll Lars <Lars.Knoll at theqtcompany.com> wrote:
>
>> On 14/10/15 23:51, "Bubke Marco" <Marco.Bubke at theqtcompany.com> wrote:
>>
>>>On October 14, 2015 23:10:26 Thiago Macieira <thiago.macieira at intel.com> wrote:
>>>> Qt does not have to provide a comparator that operates on something
>>>> other than its native string type.
>>>
>>>Isn't Qt a framework to help developers? Sorry, your argumentation
>>>does not sound very empirical.
>>
>> Of course our aim should be to help developers. But there will always be
>> some use cases which we will not cover. The question is whether this is
>> one of them or not.
>
> Most file and network content is in UTF-8, and databases too. It simply has a size and performance advantage in most cases. There are not many cases where a text consists purely of Chinese characters; mostly it is a mixture. On Linux, which is very important in embedded, UTF-8 dominates the APIs. Ask yourself whether we really don't want to support that. We could start simple and expand slowly. If the standard library supported UTF-8 collation well on all platforms we could skip it, but today you have to write your own solution again and again.
>
> For everything but US-ASCII / Latin-1, UTF-8 isn't faster than UTF-16 (feel free to compare their complexity against UTF-32).

Have you measured it? Please, no theoretical discussions. Larger texts usually carry ASCII annotations such as XML markup.

> And why "pure Chinese signs" again? Did you ever look into Unicode's Scripts.txt [1], for example? It clearly shows UTF-16 covers [almost] all spoken languages without any performance hit (compared to UTF-8), and all we pay is an extra byte for every Basic Latin character (compared to UTF-8, again).

Okay, again. Most text today is embedded in meta information, and that information is very often ASCII. Why do you think UTF-8 is used so widely?
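
To put rough numbers on the size side of this argument, here is a minimal sketch (standard C++ only, not Qt API) that counts the bytes a sequence of code points needs in UTF-8 and in UTF-16. The helper names utf8Bytes, utf16Bytes and encodedSize are made up for illustration, and the input is assumed to hold valid Unicode scalar values. It says nothing about speed, which would still have to be measured.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to encode one code point in UTF-8.
// Assumption: cp is a valid Unicode scalar value (no surrogates).
std::size_t utf8Bytes(char32_t cp)
{
    if (cp < 0x80) return 1;     // ASCII
    if (cp < 0x800) return 2;    // e.g. Latin-1 supplement, Greek, Cyrillic
    if (cp < 0x10000) return 3;  // rest of the BMP, including most CJK
    return 4;                    // supplementary planes
}

// Bytes needed to encode one code point in UTF-16.
std::size_t utf16Bytes(char32_t cp)
{
    return cp < 0x10000 ? 2 : 4; // one code unit, or a surrogate pair
}

// Total encoded size of a text under the given per-code-point cost.
std::size_t encodedSize(const std::u32string &text,
                        std::size_t (*bytesPer)(char32_t))
{
    std::size_t total = 0;
    for (char32_t cp : text)
        total += bytesPer(cp);
    return total;
}
```

For a sample such as <title lang="zh"> followed by two CJK characters and </title>, this gives 31 bytes in UTF-8 against 54 in UTF-16, while the two CJK characters on their own take 6 bytes in UTF-8 and 4 in UTF-16.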