[Development] RFC: Proposal for a semi-radical change in Qt APIs taking strings

Thiago Macieira thiago.macieira at intel.com
Wed Oct 14 20:28:04 CEST 2015


On Wednesday 14 October 2015 20:04:12 Marc Mutz wrote:
> On Wednesday 14 October 2015 18:11:26 Thiago Macieira wrote:
> > and the fact that QStringLiterals don't share will cause the
> > innocent-looking code above to require 64 bytes of read-only data.
> 
> They are shared, because it seems that lambdas within the same function have
> the same type. At least last I checked, that was what GCC implemented.

GCC 5.2, 6: 2 lambdas, data duplicated
Clang 3.7, 3.8: 2 lambdas, data duplicated
ICC 16: 2 lambdas, data duplicated

You can see from the disassembly that they are two different types.
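
The duplication follows from how closure types work: every lambda expression 
has its own unique type, so a static object declared inside one lambda is a 
different object from the one declared inside another, even when the bodies are 
identical. A minimal sketch in plain C++ (no Qt; the function names are made up 
for illustration, and the static array merely stands in for the static data 
that the QStringLiteral macro places inside its lambda):

```cpp
#include <type_traits>

// Hypothetical helper: true when both arguments have the same type.
template <typename A, typename B>
bool sameType(const A &, const B &)
{
    return std::is_same<A, B>::value;
}

// Two textually identical lambdas in one function still get distinct closure types.
bool lambdasShareType()
{
    auto first  = [] { return 0; };
    auto second = [] { return 0; };
    return sameType(first, second);
}

// Each lambda owns its own function-local static, so the data exists twice.
bool literalsShareStorage()
{
    auto first = []() -> const char16_t * {
        static const char16_t data[] = u"foo";
        return data;
    };
    auto second = []() -> const char16_t * {
        static const char16_t data[] = u"foo";
        return data;
    };
    return first() == second();
}
```

Both functions return false: the closure types differ, and the two arrays are 
distinct objects with distinct addresses.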

> >         movq    _ZN10QArrayData18shared_static_dataE@GOTPCREL(%rip), %rax
> 
> And you want the nullptr to get rid of this relocation.

Yes, but more importantly because it speeds up the check for whether reference 
counting is needed. Right now, that check tests bit 9 of d->flags, which means 
dereferencing the pointer (touching another cache line), and the compiler can 
never tell that the result is constant for QStringLiterals.

With a null pointer, the check is very trivial (a TEST instruction, for both 
the null and the ~1 check) and the compiler should be able to optimise the 
destructor away.
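
To make the difference between the two checks concrete, here is a rough sketch. 
The Header struct and both function names are invented for illustration and do 
not match Qt's actual QArrayData layout; only the 0x200 mask (bit 9, the $512 
operand of the testl in the listing that follows) comes from this mail:

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for the array header (not Qt's real layout).
struct Header {
    std::uint32_t flags;  // bit 9 marks an immutable (static) header
    int ref;              // reference count; atomic in real code
};

constexpr std::uint32_t ImmutableHeader = 0x200;  // 1 << 9 == 512

// Current scheme: the header must be loaded from memory before deciding.
bool needsRefCountViaFlag(const Header *d)
{
    return (d->flags & ImmutableHeader) == 0;
}

// Null-pointer scheme: the decision depends on the pointer value alone.
bool needsRefCountViaNull(const Header *d)
{
    return d != nullptr;
}
```

With the flag scheme the caller has to read d->flags through the pointer; with 
the null scheme nothing is dereferenced, and for a literal the pointer value is 
known at compile time.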

Here's the entire function, as it is today with one QStringLiteral only:
(compiled with GCC 6 -fno-exceptions, rearranged/edited for clarity)

	; load the literal:
        movq    _ZN10QArrayData18shared_static_dataE@GOTPCREL(%rip), %rax	; d
        movl    $3, 16(%rsp)		; str.d.size = 3
        movq    %rax, (%rsp)		; str.d.d = &QArrayData::shared_static_data
        leaq    .LC0(%rip), %rax	; u"foo"
        movq    %rax, 8(%rsp)	; str.d.b = u"foo"
	; make the call:
        movq    %rsp, %rdi
        call    _Z1fRK7QString@PLT
	; inlined QString::~QString
        movq    (%rsp), %rax		; reload the d pointer
        testl   $512, (%rax)		; d->flags & QArrayData::ImmutableHeader
        je      .L8
        addq    $40, %rsp
        ret
	; this is the dead code, it never gets run:
.L8:
        lock subl       $1, 4(%rax)	; d->ref_.deref()
        jne     .L5
        movq    (%rsp), %rdi		; load d pointer
        movl    $16, %edx		; alignof(QTypedArrayData<QChar>)
        movl    $2, %esi			; sizeof(QChar)
        call    _ZN10QArrayData10deallocateEPS_mm@PLT
        addq    $40, %rsp
        ret

A hacky implementation that uses a null pointer instead:

	; load the literal:
        leaq    .LC0(%rip), %rax	; u"foo"
        movq    $0, (%rsp)		; str.d.d = nullptr
        movq    %rax, 8(%rsp)	; str.d.b = u"foo"
        movl    $3, 16(%rsp)		; str.d.size = 3
	; make the call
        movq    %rsp, %rdi
        call    _Z1fRK7QString@PLT
        addq    $40, %rsp
        ret

The QString::~QString destructor expanded to empty with GCC. Unfortunately, 
Clang and ICC retained the check (they must be assuming the callee modified the 
const parameter).

Worse, if I change isStatic to check for the LSB being set, for the SSO case, 
even GCC gets thrown off and brings back the dead code.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



