[Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

André Somers andre at familiesomers.nl
Fri Oct 16 09:03:09 CEST 2015


On 15-10-2015 at 18:40, Konstantin Ritt wrote:
> 2015-10-15 17:52 GMT+03:00 André Somers <andre at familiesomers.nl>:
>
>     On 15-10-2015 at 14:52, Konstantin Ritt wrote:
>     >
>     >
>     > For everything but US-ASCII / Latin-1, UTF-8 isn't faster than UTF-16
>     > (feel free to compare their complexity against UTF-32).
>     > And why "pure Chinese signs" again? Did you ever look into
>     > Unicode's Scripts.txt [1], for example? It clearly shows UTF-16 covers
>     > [almost] all spoken languages, without any performance hit
>     > (compared to UTF-8), and all we have to pay is an extra byte per
>     > Basic Latin character (compared to UTF-8, again).
>     >
>     > [1] http://www.unicode.org/Public/8.0.0/ucd/Scripts.txt
>     >
>     "All we have to pay"? Isn't that quite a significant cost, if every
>     other byte in your data is going to be null?
>
>
> Only for US-ASCII / Latin-1.
>
What percentage of the strings you deal with in your applications falls
within that category? I can tell you that for us (dealing with
databases, XML, etc.) that percentage is quite high, even though we do
translate our user interfaces to support Chinese, among other languages.
The vast majority of the strings in our application would fare very well
with UTF-8. The user-visible strings are but a tiny fraction. You
have also been shown the source of the Google.cn page as a nice example.
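
To make the size argument concrete, here is a rough sketch in plain C++
(no Qt) that counts how many bytes a sequence of Unicode code points needs
in each encoding. The function names are made up for this illustration and
are not part of any Qt API; the sketch assumes valid code points and
ignores BOMs and terminators.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to encode one code point in UTF-8 (assumes a valid code point).
std::size_t utf8Bytes(char32_t cp)
{
    if (cp < 0x80)    return 1;  // US-ASCII / Basic Latin
    if (cp < 0x800)   return 2;  // e.g. Latin-1 Supplement, Greek, Cyrillic
    if (cp < 0x10000) return 3;  // rest of the BMP, including most CJK
    return 4;                    // supplementary planes
}

// Bytes needed to encode one code point in UTF-16.
std::size_t utf16Bytes(char32_t cp)
{
    return cp < 0x10000 ? 2 : 4; // one code unit, or a surrogate pair
}

// Total encoded size of a string given as UTF-32 code points.
std::size_t utf8Size(const std::u32string &s)
{
    std::size_t n = 0;
    for (char32_t cp : s)
        n += utf8Bytes(cp);
    return n;
}

std::size_t utf16Size(const std::u32string &s)
{
    std::size_t n = 0;
    for (char32_t cp : s)
        n += utf16Bytes(cp);
    return n;
}
```

For an ASCII-only string such as the XML fragment `<name>Qt</name>`,
utf8Size() gives 15 and utf16Size() gives 30. For two BMP Chinese
characters (U+4E2D U+6587) the picture flips: 6 bytes in UTF-8 versus 4
in UTF-16. Which case dominates depends entirely on the mix of strings
in the application.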

André

