[Development] Deprecation/removal model going into Qt 6

Mon Jun 3 13:34:27 CEST 2019

Giuseppe D'Angelo via Development wrote:

> On 03/06/19 00:08, Kevin Kofler wrote:
>> What you call "obsolete functionality" is functionality that existing
>> code relies on and rightfully expects to remain there.
> 
> Rightfully? By what right exactly?

APIs in libraries are meant to be used. I consider it an entirely reasonable 
expectation by developers using the APIs that they will not be removed from 
under them merely because the library developers consider them "obsolete". 
Imagine the chaos if Intel or AMD decided to remove some random "obsolete" x86 
instructions from their CPUs! x86 has kept backwards compatibility with 
every single instruction for more than 30 years. This is the standard that 
software libraries should be held to as well.

>> I'd rather get fewer (or even no) new features than losing existing ones.
> 
> How is this even an argument? Qt will need to evolve and acquire
> features to remain competitive. Again, development bandwidth is finite:
> either the overall quality decreases or some things have to get dropped.

Qt long ago reached a point where it can be considered complete. Its main 
selling point is portability to many different platforms rather than any 
specific feature. Additional features don't necessarily need to be in the 
main Qt library; they can live in community-developed addons such as KDE 
Frameworks or in the many third-party Qt-based Free Software libraries 
out there. (They can also be in Qt-Company-developed Qt Solutions if there 
is manpower left for that.)

Qt has also become larger and larger over time (despite the removal of APIs 
considered obsolete). Just compare the size of the Qt 3 tarball with the 
size of the Qt 5 monolithic tarball. This is not the result of keeping old 
APIs around, but of feature creep.

So I disagree with the assertion that Qt needs more features to remain 
competitive.

>> See also Boudewijn Rempt's blog post on the subject:
>> https://valdyas.org/fading/hacking/happy-porting/
> 
> I agree with the principle (API breaks are painful), but I strongly
> disagree with the idea that no API breaks can ever possibly happen. And
> the specific example is a terrible one to make a point as the resulting
> API break is trivial to work around (I defined such breakages
> "scriptable").

Porting from Q_FOREACH to a range-based for is not as easy as people 
think, because there are at least 2 pitfalls:
1. you have to add qAsConst or equivalent, or the non-const begin() will
   detach (i.e., deep-copy) your implicitly-shared CoW container whenever it
   is shared,
2. code that was changing the container during the iteration, which worked
   just fine with Q_FOREACH (because the iteration would still be over the
   original unchanged container), will now be undefined behavior and
   typically crash, without any compiler warning. Even qAsConst will not get
   you a warning or error for it, because it only constifies the reference
   used by the range-based for itself and not the one used by the code
   within it.

This is effectively deprecating a safe construct for an unsafe one.
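
To make pitfall 2 concrete, here is a minimal sketch. It uses std::vector as 
a stand-in for a Qt container so that it builds without Qt; that substitution 
and the function name are assumptions made for illustration only. The 
function imitates what Q_FOREACH does (iterate over a snapshot), and the 
comment at the end shows the naive port that breaks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Imitates Q_FOREACH: iterate over a snapshot of the container, so the loop
// body may modify the original freely.
// With an implicitly-shared Qt container the snapshot is a cheap shallow
// copy; with std::vector (the stand-in used here) it is a real deep copy.
inline std::size_t appendDoubledViaSnapshot(std::vector<int> &values)
{
    const std::vector<int> snapshot = values; // iteration source, never modified
    std::size_t visited = 0;
    for (int v : snapshot) {
        values.push_back(2 * v); // safe: the loop does not iterate over `values`
        ++visited;
    }
    assert(values.size() == 2 * snapshot.size());
    return visited;
}

// The naive port would be:
//     for (int v : values) values.push_back(2 * v);
// That is undefined behavior: push_back invalidates the iterators that the
// range-based for holds internally. It is left as a comment on purpose.
```

Calling appendDoubledViaSnapshot on {1, 2, 3} visits exactly the three 
original elements and leaves six elements in the container.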

>> An array of pointers is the most efficient data structure in practice
>> (operations are at most O(n)), dropping it in favor of an O(mn) data
>> structure (where m = sizeof(T)) such as QVector is a pessimization. And
>> QList also has the prepend optimization that makes most prepends even
>> O(1) rather than O(n). I don't see why almost everybody hates it.
> 
> As written, the above makes no sense, as it looks like you're comparing
> apples and oranges: time complexities against space complexities.

I'm speaking exclusively of time complexities. The space only matters when 
it goes into the formula for the time, which is the case for QVector.

QList::insert and QList::removeAt have O(n) time complexity.
QVector::insert and QVector::removeAt have O(mn) time complexity.

QList::prepend has O(1) amortized and O(n) worst case time complexity.
QVector::prepend has O(mn) (always!) time complexity.

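If you are dealing with a large class or struct, e.g. 800 bytes, then each 
QVector operation has to move 800 bytes per shifted element where QList moves 
one 8-byte pointer, so the QVector operations are up to 100 times slower than 
the QList ones!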
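
The arithmetic behind that factor can be sketched as follows. This is a 
simple model that only counts the bytes shifted by an insertion; the Big 
struct and the function names are hypothetical, and the model says nothing 
about allocation cost or cache behavior.

```cpp
#include <cassert>
#include <cstddef>

// Bytes that must be shifted to insert one element at `index` into a
// contiguous array holding `count` slots of `slotSize` bytes each.
inline std::size_t bytesShifted(std::size_t count, std::size_t index,
                                std::size_t slotSize)
{
    assert(index <= count);
    return (count - index) * slotSize;
}

struct Big { char payload[800]; }; // hypothetical 800-byte element type
static_assert(sizeof(Big) == 800, "the model assumes an 800-byte element");

// Array of values (QVector-like layout): every slot is a whole Big.
inline std::size_t valueArrayCost(std::size_t count, std::size_t index)
{
    return bytesShifted(count, index, sizeof(Big));
}

// Array of pointers (QList-like layout for large types): every slot is one
// pointer, so m drops out of the formula.
inline std::size_t pointerArrayCost(std::size_t count, std::size_t index)
{
    return bytesShifted(count, index, sizeof(Big *));
}
```

For a prepend into 1000 elements, valueArrayCost(1000, 0) is 800000 bytes, 
while pointerArrayCost(1000, 0) is 1000 * sizeof(Big *), i.e. 8000 bytes 
with 8-byte pointers. The ratio is 800 / sizeof(Big *).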

> The fact is: once one removes the big-O factors and deals with actual
> numbers and real world hardware, QVector becomes much better than QList
> as a _general purpose_ sequential container. Emphasis on the general
> purpose, please.

For me, the best general purpose container is the one that makes it hardest 
to run into big performance bottlenecks. The point being that it should be 
GENERAL purpose, i.e., work efficiently for as many use cases as possible, 
even if it requires compromises for some common ones. So an O(n) container 
is better than an O(mn) one, even if it is often marginally slower. And a 
container with prepend optimization is better than one without it.

I used array-of-pointer data structures almost exclusively even in plain C 
code, before Qt even introduced QList as it stands now. QList's API hides 
all the tedious parts of an array-of-pointer implementation from me (the 
pointer dereferences, the allocations to hold the value copies, etc.). I 
find that extremely useful and will be sad to see it go away.
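
For readers who have not written one by hand, a minimal sketch of such an 
array-of-pointers list follows. PointerList is a hypothetical name, and this 
is not QList's actual implementation: it has no prepend optimization, no 
implicit sharing, and no special case for small types. It only shows the 
dereferences and allocations being hidden behind a value-like API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical array-of-pointers list: the array stores one pointer per
// element, and each value lives in its own heap allocation.
template <typename T>
class PointerList
{
public:
    void insert(std::size_t index, T value)
    {
        assert(index <= m_slots.size());
        // Only pointers are shifted here, never the T objects themselves.
        m_slots.insert(m_slots.begin() + static_cast<std::ptrdiff_t>(index),
                       std::unique_ptr<T>(new T(std::move(value))));
    }

    void removeAt(std::size_t index)
    {
        assert(index < m_slots.size());
        m_slots.erase(m_slots.begin() + static_cast<std::ptrdiff_t>(index));
    }

    // The pointer dereference is hidden from the caller.
    T &at(std::size_t index)
    {
        assert(index < m_slots.size());
        return *m_slots[index];
    }

    std::size_t size() const { return m_slots.size(); }

private:
    std::vector<std::unique_ptr<T>> m_slots;
};
```

A side effect of this layout is that inserting or removing other elements 
never moves a stored value, so a reference obtained from at() stays valid 
until that element itself is removed.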

> Hence, we want Qt to move away from QList (and encourage users to do the
> same). The point of this thread, once more, was asking how to do that as
> painlessly as possible.

And my answer is that the only painless way is to just not do that to begin 
with. No matter how you remove or change QList, it will break lots and lots 
of existing code and take away a useful API from new code.

>> For the "unnecessary" part, because Qt has been working fine without
>> QString SSO for years.
> 
> Nice try: https://en.wikipedia.org/wiki/Appeal_to_tradition

This is not an appeal to tradition, but to practical experience. QString is 
working well in thousands of software packages now. Of course, that doesn't 
imply that it is not possible to do better, in theory. But is this 
optimization worth the trouble of breaking the ABI? Keep in mind that my 
argument is that Qt 6 should either be source- and binary-compatible with 
Qt 5 (with only the CMake build system as the change justifying the major 
version) or not exist at all (i.e., be called 5.13 instead). Of course, if 
you are breaking the ABI anyway, then one ABI breakage more or less won't 
matter.

> Second, string classes in all major C++ libraries and frameworks are
> deploying SSO *because* it is a performance win.

SSO is a clear performance win for std::string because std::string is NOT 
CoW (in fact, g++'s implementation used to be, but that was dropped because 
some obscure "clarification" in C++11 is interpreted as forbidding it). So 
you have to deep-copy anyway, and SSO saves the allocation. But for QString, 
SSO bypasses CoW, so it is only a win on the current architectures with 
ridiculously slow atomics and fast bulk copies. If atomics ever get optimized 
in a new CPU architecture, you'll wish you had the old QString back.
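
The std::string side of this can be observed with a small sketch. The two 
helper names are hypothetical, and the result of storedInline for short 
strings depends on the standard library implementation, so treat it as a 
probe rather than a guarantee.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Returns true if the character buffer lies inside the std::string object
// itself (the small-string buffer is in use) rather than on the heap.
inline bool storedInline(const std::string &s)
{
    const auto objBegin = reinterpret_cast<std::uintptr_t>(&s);
    const auto objEnd = objBegin + sizeof(std::string);
    const auto data = reinterpret_cast<std::uintptr_t>(s.data());
    return data >= objBegin && data < objEnd;
}

// Returns true if copying the string shares the original's buffer, which is
// what a CoW implementation would do. A C++11-conforming std::string is
// expected to return false here, i.e. every copy is a deep copy.
inline bool copySharesBuffer(const std::string &s)
{
    const std::string copy = s;
    assert(copy == s);
    return copy.data() == s.data();
}
```

A string longer than sizeof(std::string) can never be stored inline, so for 
it the copy has to allocate. For a short string, an SSO implementation keeps 
the characters in the object and the copy needs no allocation at all, which 
is the saving described above.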

>> For the "probably also counterproductive" part:
>> * Because there are surely architectures or environments where copying
>> 256 bytes (or whatever the SSO max length actually is)
> 
> This is a straw man argument, specifically an exaggeration.
> 
> At some QtCS Thiago was talking about 23-24 QChars, i.e. 48 bytes, plus
> a couple of pointers or so, to bring it to 64 bytes (~ a cacheline).

"or whatever the SSO max length actually is". So 64 bytes it is. That is 
still 8 times more than the single 8-byte pointer that is copied now. This is 
only faster because current architectures suck at atomics.

>> * Because the total memory use for an array of QString will likely be
>>    higher, due to the padding (space reserved for SSO)?
> 
> ... or much, much, much lower because you don't have to allocate every
> string's payload separately.

This depends on the efficiency of your platform's malloc/new, and on your 
reservation strategy (do you optimize for memory use and allocate the exact 
string size initially or do you immediately reserve space for fast append?). 
Of course, if your platform allocator always pads to 64+ bytes, then no 
matter what you do, the memory use of non-SSO QString will be the higher 
one. This is platform-dependent.

> This thread was about managing API breaks. Adding SSO to QString is not
> meant to be an API break (*). Please stop derailing the thread.
> 
> (*) Emphasis on _meant_, because obviously it yields observable side
> effects.

This thread was also about managing ABI breaks, which QString SSO definitely 
is.

        Kevin Kofler