[Development] QList

Wed Mar 29 17:28:00 CEST 2017

On 2017-03-29 16:41, Matthew Woehlke wrote:
> On 2017-03-29 07:26, Marc Mutz wrote:
>> That brings us straight back to the fundamental question: Why can the
>> C++ world at large cope with containers that are not CoW and Qt cannot?
>> The only answer I have is "because Qt never tried". And that's the end
>> of it. I have pointed to Herb's string measurements from a decade or
>> two ago. I have shown that copying a std::vector up to 1K ints is
>> faster than a QVector, when hammered by at least two threads.
> 
> 4 KiB of memory is not very much. What happens if you have larger
> objects (say, 100 objects with 96 bytes each)?

The same. QVector effectively has a hw mutex around the ref counting: 
the count is updated atomically, and only one core can have write access 
to any given cache line at a time. So the rate at which you can update 
the ref count is limited by the rate at which a single core can update 
it (in memory), divided by a factor that accounts for cache-line 
ping-pong. That factor can be as high as 2.

Deep-copying does not write to the source object, and any number of 
cores can share read access for a given cache line, each with its own 
copy, so deep-copying scales linearly with the number of cores.

Therefore, for any given element size and count there exists a thread 
count where deep-copying becomes faster than CoW. Yes, even for 1K 
objects of 1K size each.
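
A minimal sketch of how one could measure this, in plain C++ without Qt. 
It is not the benchmark referred to above: std::shared_ptr stands in for 
the atomically ref-counted CoW handle, and the names Payload, Result, 
hammer, time_shared and time_deep are made up for illustration. A real 
measurement would also have to guard against the optimiser removing the 
copies and against allocator contention.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <numeric>
#include <thread>
#include <vector>

using Payload = std::vector<int>;

struct Result {
    double seconds;        // wall-clock time for all threads together
    std::size_t checksum;  // keeps the copies observable
};

// Runs `work` `iterations` times on each of `threads` threads.
template <typename Work>
Result hammer(unsigned threads, std::size_t iterations, Work work)
{
    assert(threads > 0);
    std::vector<std::size_t> sums(threads, 0);
    std::vector<std::thread> pool;
    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&sums, t, iterations, work] {
            std::size_t local = 0;
            for (std::size_t i = 0; i < iterations; ++i)
                local += work();
            sums[t] = local;  // one write per thread, after the loop
        });
    }
    for (auto &th : pool)
        th.join();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return {elapsed.count(),
            std::accumulate(sums.begin(), sums.end(), std::size_t{0})};
}

// "Copy" by sharing: every thread increments and decrements the same
// atomic ref count, so all cores write to the same cache line.
Result time_shared(const std::shared_ptr<const Payload> &src,
                   unsigned threads, std::size_t iterations)
{
    return hammer(threads, iterations, [&src] {
        std::shared_ptr<const Payload> copy = src;
        return copy->size() + static_cast<std::size_t>(copy->back());
    });
}

// Deep copy: the source is only read, each thread writes its own buffer.
Result time_deep(const Payload &src, unsigned threads,
                 std::size_t iterations)
{
    return hammer(threads, iterations, [&src] {
        Payload copy = src;
        return copy.size() + static_cast<std::size_t>(copy.back());
    });
}
```

Comparing Result::seconds of the two functions for growing thread counts 
and payload sizes shows where, on a given machine, the cross-over lies.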

> What if you have an API that needs value semantics (keep in mind one
> benefit of CoW is implicit shared lifetime management) but tend to not
> actually modify the "copied" list?

std::vector has value semantics. OTOH, QVector's CoW leaks its reference 
semantics, e.g. if you take an iterator into a container, copy the 
container, then write through the iterator, you have written to both 
copies.
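
To make the leak concrete without depending on Qt, here is a toy CoW 
container. CowVec is hypothetical and is not QVector's implementation; 
it only mimics the scheme of sharing on copy and detaching on mutable 
access. The second function shows the same sequence on std::vector.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy copy-on-write vector (illustration only).
class CowVec {
public:
    explicit CowVec(std::vector<int> v)
        : d_(std::make_shared<std::vector<int>>(std::move(v))) {}

    // The implicit copy constructor only shares the buffer.

    int at(std::size_t i) const { return (*d_)[i]; }

    // Mutable access detaches first if the buffer is currently shared.
    int *begin() { detach(); return d_->data(); }

private:
    void detach()
    {
        if (d_.use_count() > 1)
            d_ = std::make_shared<std::vector<int>>(*d_);
    }
    std::shared_ptr<std::vector<int>> d_;
};

// True if a write through an iterator taken before the copy is visible
// in both containers.
bool iterator_write_hits_both()
{
    CowVec a(std::vector<int>{1, 2, 3});
    int *it = a.begin();  // a is unshared here, so nothing detaches
    CowVec b = a;         // b now shares a's buffer
    *it = 42;             // writes into the shared buffer, no detach
    return a.at(0) == 42 && b.at(0) == 42;
}

// Same sequence with std::vector: the copy is unaffected.
bool std_vector_copy_is_independent()
{
    std::vector<int> a{1, 2, 3};
    auto it = a.begin();
    std::vector<int> b = a;  // deep copy
    *it = 42;
    return a[0] == 42 && b[0] == 1;
}
```

Both functions return true: the toy CoW type changes both copies, the 
std::vector copy keeps its old value.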

> What benchmarks have been done on *real applications*? What were the
> results?

What benchmarks have *you* done? The world outside Qt is happily working 
with CoWless containers. It's proponents of CoW who need to show that 
CoW is a global optimisation and not just for copying of certain element 
counts and sizes.

>> (I just had to review _another_ pimpl'ed class that contained
>> nothing but two enums)
> 
> ...and what happens if at some point in the future that class needs
> three enums? Or some other member?

When you start with the class, you pack the two values into a bit-field 
and pad it with reserved bits up to a fixed size, say 4 or 8 bytes. When 
you run out, you make a V2 and add an overload taking V2. That is 
perfectly ok, since old code can't use new API anyway. This doesn't mean 
you should never use pimpl. But it means you shouldn't use it just 
because you can.
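
A sketch of what that could look like. All names here (Mode, Style, 
Level, Options, OptionsV2, apply) are invented for the example and do 
not refer to any Qt class. The static_assert holds on the common ABIs; 
bit-field layout is implementation-defined.

```cpp
#include <cstdint>

enum class Mode : std::uint8_t { Fast, Safe };
enum class Style : std::uint8_t { Plain, Fancy };
enum class Level : std::uint8_t { Low, High };  // added later, for V2

// First version: two enums packed into one 32-bit word, the rest of
// the bits reserved so that members can be added without changing
// sizeof(Options).
class Options {
public:
    constexpr Options(Mode m, Style s)
        : mode_(static_cast<std::uint32_t>(m)),
          style_(static_cast<std::uint32_t>(s)),
          reserved_(0) {}

    constexpr Mode mode() const { return static_cast<Mode>(mode_); }
    constexpr Style style() const { return static_cast<Style>(style_); }

private:
    std::uint32_t mode_ : 4;
    std::uint32_t style_ : 4;
    std::uint32_t reserved_ : 24;  // room for growth
};
static_assert(sizeof(Options) == 4, "assumes a typical bit-field layout");

// Once the reserved space is used up: a new type next to the old one.
class OptionsV2 {
public:
    constexpr OptionsV2(Options base, Level l) : base_(base), level_(l) {}
    constexpr Options base() const { return base_; }
    constexpr Level level() const { return level_; }

private:
    Options base_;
    Level level_;
};

// Old API stays; new code gets an overload. The return value (number
// of settings consumed) is only there to make the example observable.
inline int apply(Options) { return 2; }
inline int apply(OptionsV2) { return 3; }
```

Old callers keep compiling and linking against apply(Options); only 
code written against the new API ever sees OptionsV2.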

> What, exactly, do you find objectionable about PIMPL in "modern C++"?
> It can't be that it's inefficient, because performance was never a
> goal of PIMPL.

Performance is always a goal in C++. Even in Qt. Otherwise QRect would 
be pimpled, too.

>>> so I can't pass it by value into slots.
>> 
>> Why would you want to? No-one does that. People use cref, like for all
>> large types. Qt makes sure that a copy is taken only when needed, ie.
>> when the slot is in a different thread from the emitter. That is very
>> rare, and people can be expected to pass a shared_ptr<vector> instead
>> in these situations.
> 
> This (passing lists across thread boundaries in signals/slots) happens
> quite a bit in https://github.com/kitware/vivia/. Doing so is a
> fundamental part of the data processing architecture of at least two of
> the applications there.

Qt supports thousands of applications. We shouldn't optimize for 
corner-cases.

> Also, explicit sharing borders on premature pessimization. If my slot
> needs to modify the data, I have to go out of my way to avoid making an
> unnecessary copy. (This argument would be more compelling if C++ had a
> cow_ptr.)

You got that the wrong way around.

Thanks,
Marc