[Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

Fri Oct 16 09:02:15 CEST 2015

This is getting way off topic with regards to QStringView...

On 16/10/15 07:07, "Thiago Macieira" <thiago.macieira at intel.com> wrote:

>On Thursday 15 October 2015 21:02:09 Bubke Marco wrote:
>> Actually I think Qt is not the main development library people use. It
>> is there to make the boring stuff easy, to hide the different interfaces
>> between different platforms. That is why many people use Qt: they want
>> to have a GUI but don't want to invest too much time in it. The
>> interesting stuff which differentiates you from others is mostly home
>> grown, in connection with much more specialized libraries. And these
>> libraries are much more important to the users. So we should support
>> them and their interfaces, and not force our interfaces on them.

Other libraries are important to our users, but you underestimate how many
of them are using Qt if you think they are more important. And in any
case, we do not know the APIs that these libraries are using.
>> 
>
>That's a slippery slope. If we have to support every library's interface,
>we'll have a horrible mish-mash of an API that tries to support
>everything and doesn't support anything really well.
>
>I'd rather we supported the "One Qt Way" really well. Supporting other
>ways is possible, as long as we don't cause too much maintenance trouble.

Yes. Consistency in our APIs and ease of use are extremely important. They
are a large part of what made Qt what it is today.
>
>> How many users use the standard library too, especially the new
>> features? Why don't we support them much better? Why do we have to
>> reinvent the wheel again and again?
>
>I won't bother repeating the arguments of why we can't use some of the
>standard library features. And see the discussion on std::chrono as a
>replacement for QTime(Span).

Let’s take this the other way round. Many developers shy away from C++
because they see the STL as a rather complex and hard-to-use API. The
learning curve for newcomers is extremely steep.

Qt has always been about making C++ easier to use, and bringing it in line
with other languages such as Java. This has been a good part of the reason
behind our success. Let’s not forget that many of our users are not C++
experts. They are often engineers that simply want to get their work done.
>
>> I know binary compatibility is important for you, but is it really that
>> important outside of the special Linux distribution cocoon?
>
>Yes. Lots of users don't recompile the world when Qt issues a new
>release, but still try to upgrade.
>
>And besides, if we have to maintain binary compatibility for 99% of the
>API anyway because of Linux distributions, then we might as well go the
>extra mile and keep it for everyone.
>
>> Is it important under Windows,
>> is it important under Mac, is it important under embedded Linux? I think
>> the advantages are smaller than the drawbacks.
>
>I disagree.
>
>Not to mention that "embedded Linux" these days is not different from
>"regular Linux". Even devices with 32 MB of RAM have package managers and
>install software from a central repository.

This is a discussion that recurs every two years. Apart from
upgradability, there is the large advantage that keeping source
compatibility (SC) and binary compatibility (BC) is a way to restrict
ourselves from changing too much. This is not to be underestimated, as our
users very much value the stability.

Yes, I also have a list of things I’d like to change in a non-BC way, as
probably do many other people on the mailing list. But there is no way we
would be able to do all these changes without also breaking source
compatibility. And that is something we can only do very rarely.
>
>> > That's one of the two main advantages of native code. There's no
>> > sandbox to escape from.
>> >
>> > Qt already supports doing locale-aware comparison. We even have a
>> > class for it, so it can be done efficiently: QCollator and it supports
>> > our native string type (QString).
>> 
>> Do you like to live on a native island?
>
>Yes. I love writing native code. That's one of the biggest powers of Qt
>compared to any other cross-platform solution out there. So, yeah, native
>island is good.
>
>Did you mean to ask if I like constraining myself to only Qt-style APIs?
>The answer is also yes. I hate the Standard Library API because it's
>confusing (to me, obviously not so much to the people who wrote it),
>limited in convenience forcing me to write more code than focusing on
>getting stuff done. I often feel that the Standard Library tries to
>achieve 100% support of some tiny feature, overengineering it and not
>focusing on making developers' lives easier. The example is std::chrono.

+1. While I see that a lot is happening in STL land, unfortunately the
APIs feel very alien to someone used to Qt-style APIs. They feel even
worse for someone coming from Java or most other languages.
>
>> > Providing extra support for a character encoding that is not what
>> > QString uses falls in that 1%. Just use ICU.
>> 
>> Your arguments sound very tautological. Because it is unimportant we
>> don't have a string class for it. It is unimportant because QString is
>> not supporting it.
>
>You're misrepresenting the argument. QString doesn't support other
>encodings because UTF-16 is the best for the task at hand and we have too
>much legacy to support. Because of that, QCollator only supports UTF-16.
>
>> I know you love Platonic argumentation, but it would be much more
>> effective if you tried to get into the context of others and understand
>> their arguments in their context. Showing in your own context that their
>> arguments "make no sense" is not very useful.
>
>You're dangerously close to attacking me instead of attacking my
>arguments.
>
>> > That example shows how UTF-16 is better. See above on seekability of
>> > UTF-16 vs UTF-8.
>> >
>> > The solution for this is to fix the library to accept UTF-16. When we
>> > were doing Qt 5.0, we needed PCRE to support UTF-16. Their developers
>> > were very welcoming and wrote the version that supports UTF-16, so Qt
>> > does not need to reallocate.
>> 
>> Have you ever heard of Pippi Longstocking: "Widdiwiddiwitt, we make the
>> world like we wish it should be", or however it is translated into
>> English? Do you really think that you can force other, larger projects
>> to use UTF-16 instead of UTF-8 if it has disadvantages for them?
>
>And why should we support UTF-8 instead of UTF-16 if it has disadvantages
>for us?
>
>We'll have to agree to disagree with those projects. We've chosen UTF-16
>and we're aligned with a lot of other important API.

The problem here is that, depending on what you’re doing, either UTF-8 or
UTF-16 might be seen as the better encoding. Yes, many files and much of
the content you get from the internet are encoded in UTF-8 these days.
Still, many APIs we and our users need to interface with use UTF-16.

So you won’t be able to avoid conversions in any case. Qt chose UTF-16 as
its encoding at some point (16 years ago). Changing that today is
extremely difficult, and IMO not worth the effort. We’d simply exchange
one set of problems for another, while at the same time breaking
literally hundreds of millions of lines of code out there.

And again: Duplicating our API to support both is a very bad idea.

UTF-16, btw, is the native encoding used by many frameworks. If you don’t
limit yourself to Linux/POSIX (where UTF-8 is simply being used because
many APIs were taking char * pointers and UTF-8 is the only way to make
that work), you’ll see that many other languages and APIs actually use
UTF-16 as their native encoding. To list a few: ICU, Cocoa, Win32, WinRT,
Java, JavaScript, WebKit/Chromium all use UTF-16 as their underlying
string encoding.

Btw, conversions between utf-8 and utf-16 are very fast, and string memory
usage is rarely a huge issue in applications.
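
To illustrate why such a conversion is cheap, here is a minimal sketch in
plain standard C++ of a UTF-8 to UTF-16 decoder. It is not Qt's
implementation: the helper name utf8ToUtf16 is made up for this example,
and the choice to replace each malformed byte with U+FFFD is an assumption
of the sketch. The point is only that the work is a single linear pass
with a few bit operations per byte.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Qt API): decode UTF-8 into UTF-16.
// Each malformed byte is replaced by U+FFFD (an assumption of this sketch).
std::u16string utf8ToUtf16(const std::string &in)
{
    const char16_t replacement = 0xFFFD;
    std::u16string out;
    // UTF-16 never needs more code units than the UTF-8 input has bytes.
    out.reserve(in.size());

    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;       // code point being assembled
        std::size_t len = 0;   // length of the sequence in bytes
        char32_t min = 0;      // smallest code point allowed for this length

        if (b0 < 0x80)                { cp = b0;        len = 1; min = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; min = 0x10000; }
        else { out.push_back(replacement); ++i; continue; }

        // Fold in the continuation bytes (each must look like 10xxxxxx).
        bool ok = (i + len <= in.size());
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                ok = false;
            else
                cp = (cp << 6) | (b & 0x3F);
        }

        // Reject truncated, overlong, out-of-range and surrogate encodings.
        if (!ok || cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            out.push_back(replacement);
            ++i;
            continue;
        }
        i += len;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            assert(cp <= 0x10FFFF);
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Calling utf8ToUtf16 on the three bytes E2 82 AC gives the single code unit
U+20AC, while a four-byte sequence gives two code units (a surrogate
pair). In real Qt code you would use QString::fromUtf8() instead of
anything hand-written like this.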

To sum up this part of the discussion: I don’t see many advantages in
utf-8, but many disadvantages given where we are today.

Now let’s get back on topic about QStringView ;-)

Cheers,
Lars

>
>> And is PCRE now supporting both at runtime? Especially for large text it
>> would be very helpful if you didn't need to convert it to QString before
>> you use regular expressions on it.
>
>As far as I know, it operates entirely on UTF-16 in memory if the input
>was UTF-16.
>
>> > Way too much code would break if we did that because we allow people
>> > access to the data pointer in QString and to iterate directly
>> > (std::{,w,u16}string don't allow that, which makes parsing them
>> > actually a lot more cumbersome).
>> I don't see the disadvantage if you have special iterators.
>
>As long as QString continues to be backed by UTF-16, providing iterators
>is fine. In fact, we have an iterator for UCS-4: it's called
>QStringIterator and has been there for a year.
>
>> That is the power of iterators and with the new features of C++ they get
>> really useful. But anyway, I don't say that we have to change
>> everything. The last time we did that we broke our event system which is
>> still not working like it was before we introduced QWindow. I think we
>> should have an evolutionary process to adapt to the changing environment
>> and not try to reiterate what was successful in the past.
>
>I agree with what you said in this paragraph. But it does not lead to a
>conclusion about using UTF-8 or even providing our own UTF-8 class.
>
>-- 
>Thiago Macieira - thiago.macieira (AT) intel.com
>  Software Architect - Intel Open Source Technology Center
>
>_______________________________________________
>Development mailing list
>Development at qt-project.org
>http://lists.qt-project.org/mailman/listinfo/development