[Development] leak in QMetaObject?
edward.welbourne at qt.io
Wed Jul 20 11:39:00 CEST 2016
Op 20/07/2016 om 10:41 schreef Olivier Goffart:
>> The distribution does not matter. If it can be big, quadradic
>> complexity can be a blocker.
André Somers replied
> Nonsense. There is no need to pessimize the frequent cases to cater
> for avoiding a performance issue in an exceptional case.
Well, it depends ...
You've got some function, F, from number of connections to cost of any
given way of handling connections. You've got some distribution P over
values n takes. Your total cost is the sum of P(n).F(n) over that range
of values. The big-O discussion is usually about how F varies; but what
actually matters is the sum of P(n).F(n).
If your distribution has a long tail (so sum(P(n); n > N) dies off
slowly as N grows) and F(n) grows fast, it is quite easy for
sum(P(n).F(n)) to be dominated by its contributions from large n, even
if they are rare, as long as there are enough of them to - when scaled
by their relatively huge F(n) - add up.
The frequency distribution P effectively acts like the constant of
proportionality that your big-O notation ignores; if that constant is
large for an O(n) algorithm but small for an O(n^2) algorithm, n has to
be large compared to the ratio between those two constants of
proportionality before O(n^2) is worse than O(n). If one in a thousand
of your population has n ~ 1k then an O(n^2) algorithm has those costing
as much as all the (vastly more common) cases with n ~ 1.
So optimising for the rare case (which may indeed make the common case
more expensive) can indeed be worth it, because the rare case can end up
making a disproportionate contribution to your total costs. You have to
know your distribution before you make up your mind about this - and
dumb statistics like mean and median are no substitute for the actual
distribution, for all that they're better than no data. Long tails matter.
More information about the Development