[Development] qsizetype

Mon Sep 19 20:51:11 CEST 2022

On Tue, Sep 13, 2022 at 01:12:57PM +0000, Volker Hilsheimer wrote:
> > On Wed, Sep 07, 2022 at 06:38:30PM +0200, A. Pönitz wrote:
> >> [...] What would have been wrong with starting with
> >> 
> >> #ifdef I_AM_WORKING_ON_IT
> >> using qsizetyp_ = qsizetype;
> >> #else
> >> using qsizetyp_ = int;
> >> #endif
> >> 
> >> then have the people working on it (and only those, plus perhaps
> >> potential early adopters) define the macro locally, "port" int to
> >> qsizetyp_, and when everyone is happy with the scope and the
> >> implications of the change, at 7.0 time, globally replace qsizetyp_
> >> by qsizetype ?
> >> 
> >> Why is all this done as operation at an "open heart" instead of
> >> having a "staging" and "production" setup?
> > 
> > 
> > Could anyone involved in the decision making that resulted in the
> > approach taken here please comment?
> 
> 
> I can't claim that I was involved in the decision making,
> but here’s how I see it:

I appreciate this, albeit I believe it would possibly help to
correct my understanding of the matter if the actual reasoning on
cost and benefits were given by someone actually involved, not just
our combined best guesses.

Some comments (please ignore / handle the ones in [...] in separate
threads to not dilute this here further)

> We have the tools to change - with some limitations - API signatures
> without breaking either source or binary compatibility.

[The introduction of first overloads to functions - and that's one
core 'tool' here - /is/ source-incompatible. That this particular
kind of source-incompatibility was declared "harmless" doesn't
really change the fact.]
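
[To illustrate the kind of breakage meant here, a minimal sketch in plain
C++ without Qt. The types Before and After and the alias wide_size are
hypothetical stand-ins, with wide_size playing the role of a 64-bit
qsizetype. The lines that stop compiling are shown as comments so that the
block itself still builds.]

```cpp
#include <cassert>

// Hypothetical stand-in for a 64-bit qsizetype.
using wide_size = long long;

// API before the change: a single function taking int.
struct Before {
    long long last = 0;
    void resize(int n) { last = n; }
};

// API after the change: a first overload is added next to the old function.
struct After {
    long long last = 0;
    void resize(int n) { last = n; }
    void resize(wide_size n) { last = n; }
};

using BeforeFn = void (Before::*)(int);
using AfterFn = void (After::*)(int);

// With a single function, "auto p = &Before::resize;" compiles.
inline BeforeFn beforePtr()
{
    auto p = &Before::resize;
    return p;
}

// With two overloads the same pattern is rejected:
//     auto p = &After::resize;   // error: type cannot be deduced
// so user code has to be edited to name the overload it wants.
inline AfterFn afterPtr()
{
    return static_cast<AfterFn>(&After::resize);
}

// Calls with an argument that is neither int nor wide_size are hit as well:
//     Before b; b.resize(5u);    // fine: unsigned converts to int
//     After a;  a.resize(5u);    // error: ambiguous, both conversions rank equally
```

[Whether such cases count as "harmless" is the judgment call under
discussion; the sketch only shows that previously valid code can stop
compiling.]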

> deprecate and “weaken” old overloads in favour of new overloads; or we
> can remove the old overload completely from the public API and still
> continue to export the old symbol through the module-specific
> ‘removed_api.cpp’ files.

> This is conceptually great news, it gives us a bigger toolbox than
> what we had before. Technically, this is very powerful and useful,
> allowing us to fix mistakes gradually, while giving users control over
> what kind of deprecation warning level they want (from completely
> silent, up to code no longer compiling).

[It also means that users are exposed over a lengthy period of time
to these "harmless" source incompatibilities]

> This is IMHO superior to a temporary type alias: A string-based
> signal/slot connection where the signal has been ported to emit a
> qsizetyp_ while the slot still receives int will fail. So that would
> break source compatibility. But if both slot overloads are still
> visible for moc when Qt is built, but not to the compiler when Qt is
> used, then those connections will continue to work.

[That's a good point, and something that didn't appear to me as a
problem so far, possibly because I have only a few string-based
connects left.  And you are right, this /additional/ incompatibility
hits the users with the current approach only at Qt 7 time, whereas
the temporary alias would hit "as we go". So basically the known
options are now "compile errors as we go plus connect errors at one
time" vs "connect errors as we go plus compile errors at one time".

That's nothing I'd really like to discuss here deeper, because I
actually believe that neither is anywhere close to a good solution,
see below]
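
[A toy model of the connect failure described in the quoted paragraph. This
is not Qt's implementation; it only assumes, as the quoted text states, that
a string-based connection is matched on the argument types as spelled in the
signature. The function names argumentList and toyConnect are hypothetical.]

```cpp
#include <cassert>
#include <string>

// Toy model only: extract the text between the outermost parentheses
// of a signature such as "countChanged(int)".
inline std::string argumentList(const std::string &signature)
{
    const auto open = signature.find('(');
    const auto close = signature.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return std::string();
    return signature.substr(open + 1, close - open - 1);
}

// Toy model only: a connection is accepted when both argument lists are
// spelled identically. A type alias does not help here, because the
// comparison looks at the written name and not at the underlying type.
inline bool toyConnect(const std::string &signal, const std::string &slot)
{
    return argumentList(signal) == argumentList(slot);
}
```

[In this model toyConnect("countChanged(int)", "setCount(int)") is accepted,
while toyConnect("countChanged(qsizetyp_)", "setCount(int)") is rejected,
even on a build where qsizetyp_ is an alias for int.]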

> So, I think we have the right tools. The discussion we need to
> have is when to use them. As I have proposed in this thread: this
> has to be a case by case decision.
> 
> QTimer should allow timeouts longer than 2^31 msecs, i.e. < 25
> days.  Great that we could fix this before Qt 7.

[I agree, but this one is completely unrelated to the size of
qsizetype]
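
[For reference, the arithmetic behind the "25 days" figure, assuming the
interval is held as a signed 32-bit count of milliseconds. The constant
names are made up for this sketch.]

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest interval a signed 32-bit millisecond count can express.
constexpr std::int64_t kMaxIntervalMs = std::numeric_limits<std::int32_t>::max();
constexpr std::int64_t kMsPerHour = 60LL * 60 * 1000;
constexpr std::int64_t kMsPerDay = 24 * kMsPerHour;

// Split the limit into whole days and the remaining whole hours.
constexpr std::int64_t kWholeDays = kMaxIntervalMs / kMsPerDay;
constexpr std::int64_t kExtraHours = (kMaxIntervalMs % kMsPerDay) / kMsPerHour;

static_assert(kWholeDays == 24, "the limit is just under 25 days");
static_assert(kExtraHours == 20, "24 days and 20 hours, rounded down");
```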

> QDir::count and operator[] now work with qsizetype. I suppose there
> can be >2^31 files in a directory, perhaps more so in 10 years than
> now.
> Nevertheless, I do wonder whether this is worth the potential
> source compatibility breakage that is pointed out in the commit
> message. But as long as users need to opt into deprecation warnings
> explicitly, that is ok as well (and would be a “staging" and
> “production" setup, in practice).

[Latest at Qt 7 the user has /no choice/...]

I'd like to go back to the beginning of this thread and collect/reword
a few items:

0. To get this out of the way: If all this were about a completely new
codebase, I would see no reason to /not/ use some qsizetype with
sizeof(qsizetype) == sizeof(size_t) and this discussion would not exist.

I also agree that sizeof(qsizetype) == sizeof(size_t) has (limited...)
benefits in interacting with other bits of code, most notably the 
C++ Standard Library.

Consequently, I /do/ consider changing to sizeof(qsizetype) ==
sizeof(size_t) a 'Nice to have' feature. I "just" do not see the effort
necessary to get there as anywhere close to justifiable, by at least
two orders of magnitude(!), in any metric I'd consider appropriate
in the current context.
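
To make the interaction with the C++ Standard Library concrete, here is a
small sketch. It assumes a 32-bit int, which holds on the usual platforms
but is not guaranteed. The alias std_wide_size is a hypothetical stand-in
for a signed size type as wide as size_t, and fitsIn is a helper invented
for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for a signed size type with sizeof == sizeof(size_t).
using std_wide_size = std::make_signed<std::size_t>::type;
static_assert(sizeof(std_wide_size) == sizeof(std::size_t),
              "same width as size_t by construction");

// True if a size reported by the standard library (a size_t) can be
// stored in T without loss.
template <typename T>
constexpr bool fitsIn(std::size_t n)
{
    return n <= static_cast<std::size_t>(std::numeric_limits<T>::max());
}
```

With a 32-bit int, fitsIn<int>(3000000000u) is false, so such a value needs a
range check or a cast wherever it crosses into int-based code. With the wide
stand-in, only values in the upper half of the size_t range fail the check.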

1. Having a consistent 'main' size type in an application is important,
anything else is a pain for creation and maintenance. 

2. Thanks to implicit sharing, Qt containers are pretty convenient, for
small and medium work sets. In a GUI application one typically does not
have to bother about "unneeded" copies (yes, I hear someone cringe,
doesn't change the fact). The price for that is comparatively expensive
(alternative: inconvenient) write access, which as a result makes them
unlikely candidates for data storage in number crunching, which in turn
is often related to 'large' containers.
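
The trade-off in item 2 can be sketched with a toy copy-on-write container.
This is not how Qt implements implicit sharing; the class CowInts and all of
its members are hypothetical and only show where the cost moves: copies are
cheap, and every write pays for a sharing check and possibly a deep copy.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy copy-on-write container; hypothetical, not Qt's implementation.
class CowInts {
public:
    CowInts() : d(std::make_shared<std::vector<int>>()) {}

    // Read access never copies the payload.
    int at(std::size_t i) const { return (*d)[i]; }
    std::size_t size() const { return d->size(); }

    // Every write first checks whether the payload is shared and, if so,
    // makes a private deep copy ("detach") before modifying it.
    void append(int v) { detach(); d->push_back(v); }
    void set(std::size_t i, int v) { detach(); (*d)[i] = v; }

    bool sharesDataWith(const CowInts &other) const { return d == other.d; }

private:
    void detach()
    {
        if (d.use_count() > 1)
            d = std::make_shared<std::vector<int>>(*d);
    }

    std::shared_ptr<std::vector<int>> d;
};
```

Copying a CowInts only copies a pointer, which is the convenience described
above. The detach() call on each write is the price, and in this model it
grows with the container size whenever the data is shared.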

With 1 & 2 I am now at "Nice to have, but not really necessary"
(a.k.a  P3 or P4 in JIRA).

3. Qt 5 was fairly consistently using 'int' all over the place,
exceptions could well be explained by real needs (e.g. file
sizes /truly/ exceeding 2GB), the affected code areas were 
comparatively rarely used and fairly self-contained.

The qsizetype change for Qt container sizes in 6.0 affects large
parts of Qt and similarly large parts of applications using Qt
in earnest. These changes are /not/ self-contained.

In regard to Qt itself I believe Marc refers to this problem in the
initial mail in this thread when he states:

    "I do observe, though, that, starting with the premiss of a Qt
    container of size > 2 Gi, the current code base has a pretty high
    bug density in this area:"

        - https://codereview.qt-project.org/c/qt/qtbase/+/...
        - https://codereview.qt-project.org/c/qt/qtbase/+/...
        - https://codereview.qt-project.org/c/qt/qtbase/+/...
        - https://codereview.qt-project.org/c/qt/qtbase/+/...
        - https://codereview.qt-project.org/c/qt/qtbase/+/...
        - https://codereview.qt-project.org/c/qt/qtbase/+/...

I do not quite agree with the characterization of these problems as bugs
(mostly because I think using Qt containers of this size is ill-advised
and consequently does not happen "in reality", see above) but I do agree
that the change introduced a large number of "code smells" that
adversely affect at least the /perceived/ quality of the code base.

Dropping the ball /right here/ basically gives the worst we can get:
Losing the benefits of having one uniform size type in the code base,
creating friction/"impedance" issues within Qt itself and any Qt using
code base, and all that while not reaping any of the (little, if
any...) benefits of an all-sizeof(qsizetype)==sizeof(size_t) code base.

In the current state, we are way /below zero/ on "benefits minus costs".

Now Marc advises to move on, because

  "I don't really want to start a discussion on whether (owning) Qt
   containers _should_  support more than 2 Gi elements in the first place.
   From my pov, that decision has been made in the run-up to 6.0, and
   everyone who knows me knows that I don't much care about the owning Qt
   containers and their shortcomings."

That is a way to approach it, but obviously - also from this statement -
not the only one.

For one I believe part of the initial discussion - for which I btw
can't find any on-line reference anymore - was actually to keep
qsizetype flexible, i.e. re-definable at configure time by the user.
[Am I dreaming here?]
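
[What "re-definable at configure time" could look like, as a sketch only.
The macro TOY_WIDE_SIZETYPE, the alias toy_sizetype and the function
countPositive are hypothetical names and do not correspond to any existing
Qt configure option or API.]

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical build-time switch: the size type is chosen by one macro
// that a configure step would define or leave undefined.
#ifdef TOY_WIDE_SIZETYPE
using toy_sizetype = std::ptrdiff_t;
#else
using toy_sizetype = int;
#endif

// Code written against the alias compiles unchanged under either setting.
inline toy_sizetype countPositive(const std::vector<int> &values)
{
    toy_sizetype n = 0;
    for (int v : values) {
        if (v > 0)
            ++n;
    }
    return n;
}
```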

Second, the changes made in Qt 6 still cover just a very
small fraction of what an "all qsizetype" code base would look like.
Everyone can access public gerrit and git and come up with their own
guesses from the proliferation/counts of 'qsizetype' and 'int' in the
source. My best guesstimate is that we are safely below 10%, very
likely even below 1%, of the necessary effort _for Qt alone_.
Please check the changes that went in so far and the resulting code
base before shouting at me. 

I believe it's a safe bet that at the current rate there's no hope
to finish a transition at Qt 7 time (i.e. "in a handful of years"(!))
while nevertheless binding a significant amount of Qt core
development resources (and that already with the SiC implications on
users as mentioned before). At the end of this ordeal we'd have 
_one_ "Nice to have" issue fixed. One out of thousands.

The options I see are:

(A) Push through.

(B) Stay where we are (plus or minus a bit)

(C) Go back to qsizetype == int
     (C1) totally, 

     (C2) alternatively, using something with sizeof(size_t) for
          Q*View types, but int elsewhere.

As of now, I consider (C) the most reasonable (as in "least
undesirable") of the choices:

(A) That's the "technically best" solution, but again, the whole
"benefit" compared to (C) here is a rather theoretical one, with no real
practical impact, and this approach comes with an absurdly high
price, both for Qt development and all Qt users.

(B) is apparently what you would now prefer, but code-wise
this is worse than either (A) or (C). It incurs most of the cost of
(A) on the users while not bringing in the benefits.

(C) means effectively abandoning the "patches we've been pushing over
the last months", but as this is in total only a small fraction of what
is needed to get to (A), this is effectively "cutting losses",
getting us back a uniform code base with no costs for most users.

I've mentioned (C1) and (C2) as variations here because the Q*View zoo
is currently still quite separate from the rest and not much user code
is using it already, and even then in this context taking explicit sizes
is not the most common operation. So having Q*View type sizes with
sizeof(size_t) would be a "fairly self-contained" subsystem that can be
carved out.

Andre'