[Development] Changing container privates again

Mon Jun 11 15:31:26 CEST 2012

On segunda-feira, 11 de junho de 2012 11.02.24, João Abecasis wrote:
> > struct Q_CORE_EXPORT QArrayData
> > {
> > 
> >    QtPrivate::RefCount ref;
> >    uint flags;
> >    int size;       // ### move to the main class body?
> >    // -- 4 bytes padding here on 64-bit systems --
> >    qptrdiff offset; // in bytes from beginning of header
> >    // size is 16 / 24 bytes
> > 
> > };
> 
> The most often used elements have to be offset and size. The ref-count is
> used in copying and detaching, as typically will be the flags.

Agreed.

> As long as we're doing more open-heart surgery I would support moving offset
> and size to the beginning of the struct, to try to avoid padding holes.

I don't think moving them around will change anything. It won't make the code 
any faster and it won't avoid a padding hole. The structure contains an odd 
number of 32-bit entries, so it requires one 32-bit padding somewhere to add 
up to a multiple of 64 bits.

> If we are to inline any data in QVector/QString/QByteArray/QList themselves,
> offset and size should go together, which would potentially make
> sizeof(QContainer) == 3 * sizeof(void*), up from sizeof(void*). Read-only
> operations wouldn't touch the header or the reference count at the cost of
> copies and, well, sizeof(QContainer).

I agree. Moving just the size wouldn't help much. It would optimise isEmpty(), 
but not anything else.

If we move the offset, then I see no point in keeping it an offset. I'd just 
keep a regular pointer. It's an offset only so we could keep the Data header 
relocation-free when saving in a static variable.

For a month or two, QStringLiteral returns a QString, not just a QStringData. 
So this surgery is possible.

Note also that if we move both, then we finally have cheap substrings, since 
the pointer to the beginning of the data and the size are kept separate from 
the reference counter. Two different QStrings could have pointers to the same 
data and also share the reference counter.

> > struct QArrayAllocatedData : public QArrayData
> > {
> > 
> >    uint alloc;
> >    // 4 bytes tail padding on 64-bit systems
> >    // size is 20 / 32 bytes
> > 
> > };
> > 
> > struct QArrayForeignData : public QArrayData
> > {
> > 
> >    void *token;
> >    void (*notifyFunction)(void *);
> >    // size is 24 / 40 bytes
> > 
> > };
> 
> Following Olivier's suggestion I think it might be beneficial to have alloc
> unconditionally in the base class, even if it is not being used at times.
> If nothing else, it saves padding. Additionally, deleter function and token
> would be more generally useful if they can be used with the alloc member.

Yes, if we move the alloc member to the main struct, then it will simply 
occupy the space currently used for padding. The total overhead for an 
allocation would be 24 bytes on 64-bit systems. Since glibc's allocator is 16-
byte aligned, this means our data is always 8 bytes off.

In my opinion, that's actually worse than having a larger header.

> Say, you allocate a buffer, you want some QContainer to manage it *and*
> change it in place as necessary *and* free it when done. This is similar to
> the WebKit use case, but here we don't assume an external ref-count so
> ref-count of 1 means the data can be changed.

[see discussion below]

> > The changes are:
> > - instead of a single bit of flags, there's a full integer now (32 bits),
> > which allow us to to more things.
> > 
> > - I've reserved 3 of those bits to indicate the type of array: raw data
> > (like QStringLiteral and fromRawData), heap-allocated data, foreign data.
> I want to clearly discuss what those bits or flags will mean and their exact
> semantics. These are my suggestions:
> 
>     - CapacityReserved (taking over from the current bit)
>     - Mutable (taking over from alloc != 0, frees alloc member to be reused
> when data is immutable) 

Sure, that's doable.

>     - Notify/Deleter (new functionality)

Ok, I see. You're splitting the deleter from the immutability. Since we do 
have a flag for immutability, we can serve WebKit's use-case, but also do in-
place modifications.

I find that a nice use-case.

You're going further and suggesting that we could also tell the container how 
much memory was reserved externally, so it could grow around without 
reallocating. I think that's your reason for keeping the alloc member together 
with the notify function.

However, I don't think that's going to be a very useful use-case. I already 
think that in-place modification of externally-allocated data is a stretch use-
case, but one that could be eventually useful. Allowing the container to use 
memory around that is too much, IMHO.

> > - foreign data is like fromRawData: it comes from somewhere else, we don't
> > manage its memory and it's immutable. The difference is that we'll call a
> > notify function (see above) when our refcount drops to zero. This allows,
> > for example, to use QString on data managed by WTF::String or UTF-16 data
> > stored in an mmap'ed page.
> 
> Let's make "foreign" and immutable orthogonal.

Agreed.

I'll prepare a change.

> > - the "alloc" member is moved from the standard data to the allocated data
> > (it makes sense). This member should only ever be accessed in non-const
> > functions (more or less, there are exceptions), usually after those
> > functions have decided that they need to allocate or reallocate memory.
> 
> Let's keep it in the base but not overload it with the "mutable bit"?

I've already de-overloaded it. My current code checks the array type for 
immutability. I'll change that to a new flag.

But I still don't agree with moving the alloc member back to the main class. 
See above and below.

> > - as a drawback, the header overhead in memory allocation increases 4
> > bytes
> > on 32-bit systems and 8 bytes on 64-bit ones (due to the presence of tail
> > padding). It's possible to store data in that tail padding, though, but I
> > don't want to .
> 
> Keeping alloc in the base and moving things around sizes would be fixed at
> 20/24 bytes, with no padding. "Foreign" data would grow the header to 28/40
> bytes.

24 bytes for 64-bit is not a good idea, as I said above. The size does not 
change for 32-bit data either way.

> > - in fact, I was thinking of changing the size and alloc members to
> > size_t,
> > so this header could be used on a container supporting larger memory
> > sizes, if we wanted to, in the future.
> 
> As long as we're having this discussion, I see no reason not to make them
> unsigned, at the very least. I support size_t as well.

Changing them to size_t eats up the padding.

> > - André requested that the size member be moved away from the header into
> > the main class body itself. I haven't done that yet and I'm not sure I
> > should. It would be major surgery for something that we haven't proven
> > yet. And I don't think we have time to prove it. But it would decrease
> > the size of the header by 8 bytes on 64-bit systems.
> 
> I think it only helps if we move both offset and size together. I'm not
> convinced this is something we should be doing. (Besides now being very
> sub-optimal timing for this discussion to be happening ;-)

Agreed on all three accounts.

> > - since we stored the type of the data, we know when we allocated it.
> > Since
> > we know that and we know the size of the header, the "offset" member
> > becomes the prepend optimisation: we know how much buffer we have between
> > the end of the header and the beginning of the data.
> 
> This requires that we tie "foreign" to detached allocation and "native"
> (not-"foreign" :-p) to a single allocation. This is an assumption that I
> would like to get rid of. Perhaps we can make it hold for the constrained
> use case of QList, which offers the prepend optimization.

Yes, it ties the prepend optimisation to the "native allocation" (i.e., 
QArrayAllocatedData). Since QList cannot have foreign data anyway, it's not a 
problem for it.

Even if I agreed with your idea of letting the container expand on foreign 
data -- which I don't -- the only implication here is that it could not expand 
towards smaller memory addresses (prepend). It can only expand towards 
appending.

> > Besides all that, there are a bunch of small changes. Before we discuss
> > the
> > minor changes and implementation details (whether we need one or two
> > option
> > enums, for example), I'd like to know if anyone has more ideas relating to
> > the headers themselves.
> 
> Following what was discussed in the patches themselves, I suggest you keep
> AllocationOptions around as long as reasonable and then we'll have a much
> better idea of whether to keep them independent or ditch them ;-) But
> agreed, those are implementation details and there are still some issues
> above that need to be hammered down.

That would require rebasing and changing everything, which means spending 2-3 
hours just doing that. I don't want to do it.

I've just pushed an update that masks off the Unsharable bit.

However, I'd rather do the other way around, like we're doing for the 
immutability flag: keep a bit in the flags, which can be tested with a single 
CPU instruction (yes, even on ARM) and stop using QRefCount.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
     Intel Sweden AB - Registration Number: 556189-6027
     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 190 bytes
Desc: This is a digitally signed message part.
URL: <http://lists.qt-project.org/pipermail/development/attachments/20120611/14a1d3fa/attachment.sig>