[Development] Changing container privates again

André Pönitz andre.poenitz at mathematik.tu-chemnitz.de
Sat Jun 9 21:42:55 CEST 2012


On Sat, Jun 09, 2012 at 05:45:01PM +0200, Thiago Macieira wrote:
> [...] - André requested that the size member be moved away from the
> header into the main class body itself. I haven't done that yet and
> I'm not sure I should. It would be major surgery for something that we
> haven't proven yet. And I don't think we have time to prove it. But it
> would decrease the size of the header by 8 bytes on 64-bit systems.

"Request" is a bit strong here.

I asked for spending a thought or two on inlining some data like 
the size into the main structure.

The request was triggered by the observation that container-using
code every now and then expands to a few more instructions than one
would naively expect for "access of the n'th item in an array-like
structure".

I am sure there is room for _some_ improvement, but I am not sure
whether this would be worthwhile/measurable/significant.

My gut feeling says there is a sweet spot at 16 byte object size,
with a direct pointer to the raw data, a size member, and some
sensible use of the extra bits.

A "normal" QString could look like

  obj +  0 [data]  
  obj +  4 [....]
  obj +  8 [size]
  obj + 12 [encoded alloc] 

with the "data" blob

  data - 8 [whatever...]
  data - 4 [refcount]
  data + 0 [QChar 0]
  data + 2 [QChar 1]
  data + 4 [....]
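
To make the "data" blob layout above a bit more concrete, here is a
rough sketch of how the refcount could sit directly in front of the
characters. All names (BlobHeader, allocBlob, headerOf, freeBlob) are
made up for illustration and the use of malloc is an assumption; this
is not how Qt actually allocates its string data.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical header stored immediately before the characters.
struct BlobHeader {
    std::int32_t whatever;  // data - 8 [whatever...]
    std::int32_t refcount;  // data - 4 [refcount]
};
static_assert(sizeof(BlobHeader) == 8, "header must be exactly 8 bytes");

// Allocate header and characters in one block and return a pointer to
// the first character (data + 0), which is what the object would store.
inline char16_t *allocBlob(const char16_t *src, std::int32_t n)
{
    void *raw = std::malloc(sizeof(BlobHeader) + n * sizeof(char16_t));
    if (!raw)
        return nullptr;
    BlobHeader *hdr = new (raw) BlobHeader{0, 1};  // refcount starts at 1
    char16_t *data = reinterpret_cast<char16_t *>(hdr + 1);
    std::memcpy(data, src, n * sizeof(char16_t));
    return data;
}

// Step back from the character pointer to reach the header.
inline BlobHeader *headerOf(char16_t *data)
{
    return reinterpret_cast<BlobHeader *>(data) - 1;
}

inline void freeBlob(char16_t *data)
{
    std::free(headerOf(data));
}
```

With that arrangement, item access only needs the data pointer held in
the object, and the refcount is only touched on copy and destruction.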

A "fromRaw" QString could be something like

  obj +  0 [data]  
  obj +  4 [....]
  obj +  8 [size]
  obj + 12 [-1]

  data + 0 [QChar 0]
  data + 2 [QChar 1]
  data + 4 [....]

i.e. not require any heap allocation. As "fromRaw" is very cheap in
this setup, it can serve as a replacement for QStringRef in some
use cases, without needing extra overloads on the consumer side.
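
As a sketch of the 16 byte object described above: the struct and the
helper names (StringHandle, fromRaw, isRaw, at) are hypothetical, and
the union is only one assumed way of keeping the first slot 8 bytes
wide on both 32-bit and 64-bit targets. It is not Qt's actual QString.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 16 byte string object following the diagrams above.
struct StringHandle {
    union {
        const char16_t *data;  // obj + 0: direct pointer to the characters
        std::uint64_t slot;    // pads the pointer slot to 8 bytes on 32-bit
    };
    std::int32_t size;          // obj + 8
    std::int32_t encodedAlloc;  // obj + 12: -1 marks a "fromRaw" string
};
static_assert(sizeof(StringHandle) == 16, "object should be 16 bytes");
static_assert(offsetof(StringHandle, size) == 8, "size at obj + 8");
static_assert(offsetof(StringHandle, encodedAlloc) == 12, "alloc at obj + 12");

// Wrap existing characters without any heap allocation.
inline StringHandle fromRaw(const char16_t *chars, std::int32_t n)
{
    StringHandle h;
    h.slot = 0;         // clear the whole 8 byte slot first
    h.data = chars;
    h.size = n;
    h.encodedAlloc = -1;
    return h;
}

inline bool isRaw(const StringHandle &h)
{
    return h.encodedAlloc == -1;
}

// Access of the n'th item: one load of the pointer, one indexed load.
inline char16_t at(const StringHandle &h, std::int32_t n)
{
    return h.data[n];
}
```

A substring in this scheme would just be another fromRaw() call with
an offset pointer and a shorter size, which is where the "cheap
substrings" idea comes from.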

A "proper" QStringRef in that scheme would probably have to use
obj+4 as the pointer to the base string, but that fits on 32-bit
architectures only.  The 64-bit scenario doesn't go well with the
16 byte limit when it needs to cover the string ref case.  By only
allowing certain values for "encoded alloc" - like 1 - 16 byte, 2 -
32 byte, 3 - linear/slightly exponential there are a couple of
"free" bits there, but not enough for a full pointer. Even not
inlining the full alloc field would be enough.

Whether this is needed, or whether the "cheap substrings" are a
sufficient replacement for the cases where QStringRef currently
is actively used, and what the actual impact would be, is something
that I don't know. "Spending a thought or two" doesn't seem to hurt,
though.

As a final remark: My gut feeling hasn't been updated after
the introduction of QStringLiteral, as I haven't had much
exposure to Qt 5-only code yet. QStringLiteral covers some 
of the "problematic area", maybe there's not much of an itch
worth scratching anymore.

Andre'


