[Development] RFC: Deprecating setSharable / isSharable in our containers and QString

Thiago Macieira thiago.macieira at intel.com
Fri Feb 21 22:21:01 CET 2014


On Friday, 21 Feb 2014, at 12:49:08, Thiago Macieira wrote:
> 1) keep a non-null d for an empty-but-not null object (size == 0)
>   a) d points to a read-write section of memory where we can ref up and down
> drawback: slowness due to atomic operations in multiple cores
> 
>   b) d points to another special value, like d = 0x1
>         bool isStatic() const { return (quintptr(d) & ~1) == 0; }
>         /// Returns false if deallocation is necessary
>         bool deref() { return isStatic() || d->deref(); }
> 
> 2) keep a null d for static empty, but a non null begin pointer
>         bool isNull() const { return d == nullptr && b == nullptr; }
>         bool isEmpty() const { return size == 0; }
>         T *constData() const { return b; }
> 
> drawback: constData() would return nullptr for a null object, which would 
> allow people to start relying on that.

Uh... one minute after sending this, while replying to Marc, I realised that 
option 2 won't work. We need to have a state for raw data that doesn't do 
reference counting either. Raw data needs to have a b != nullptr, so we need 
to store the "staticness" in the d pointer anyway.
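As a minimal sketch of what "staticness in the d pointer" could look like, 
the snippet below combines isStatic() and deref() from option 1b with a 
separate begin pointer. Header, Handle, staticTag() and the three factory 
functions are hypothetical names for illustration only, std::uintptr_t stands 
in for quintptr, and using d = 0x1 for raw data is an assumption, not a 
decided design.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the shared header; not Qt's real data block.
struct Header {
    std::atomic<int> ref{1};
    // Returns false when the last reference is dropped.
    bool deref() { return ref.fetch_sub(1, std::memory_order_acq_rel) != 1; }
};

// Hypothetical handle: d says whether to reference-count, b where the data is.
struct Handle {
    Header *d;
    const char *b;
    std::size_t size;

    // True for d == 0 and d == 0x1; neither value is ever dereferenced.
    bool isStatic() const { return (std::uintptr_t(d) & ~std::uintptr_t(1)) == 0; }
    /// Returns false if deallocation is necessary
    bool deref() { return isStatic() || d->deref(); }
};

// Assumed encoding of the states (illustration only).
inline Header *staticTag() { return reinterpret_cast<Header *>(std::uintptr_t(1)); }
inline Handle nullHandle() { return Handle{nullptr, nullptr, 0}; }
inline Handle rawHandle(const char *p, std::size_t n) { return Handle{staticTag(), p, n}; }
inline Handle heapHandle(const char *p, std::size_t n) { return Handle{new Header, p, n}; }

inline void exampleUsage() {
    Handle raw = rawHandle("abc", 3);
    assert(raw.isStatic() && raw.deref());  // nothing to free, no atomic touched

    Handle heap = heapHandle("abc", 3);
    assert(!heap.isStatic());
    if (!heap.deref())                      // last reference: caller frees
        delete heap.d;
}
```

The point of the sketch is that raw data keeps b != nullptr while the d 
pointer alone decides whether any atomic operation happens.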

A quick check shows that, on x86, on ARM, and on PowerPC, there's no difference 
in the number of instructions for:
	bool isStatic() const { return (quintptr(d) & ~1) == 0; }
and
	bool isNull() const { return !d; }

One expands to:
        testq   $-2, %rax		(x86-64)
        bics    r2, r3, #1		(ARMv7-A)
        rlwinm. 10,9,0,0,30		(PPC)
The other to:
        testq   %rax, %rax		(x86-64)
        cmp     r3, #0			(ARMv7-A)
        cmpwi 7,9,0			(PPC)

On MIPS, it requires two extra instructions.
        li      $3,-2                   # 0xfffffffffffffffe
        and     $3,$2,$3
        beq     $3,$0,$L12
versus:
        beq     $2,$0,$L12

Likewise for IA-64, though there it requires only one extra instruction:
        and r15 = -2, r14
        ;;
        cmp.eq p6, p7 = 0, r15
versus:
        cmp.eq p6, p7 = 0, r14

And for AArch64, where it also requires one extra instruction:
        cmp x1, #2
        b.lo .LBB0_3
versus:
        cbz     x1, .LBB0_3
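Note that the cmp/b.lo pair above is an unsigned "d < 2" comparison rather 
than a mask-and-test. The two forms accept exactly the values 0 and 1, which 
the following sketch checks on a handful of samples. The function names are 
hypothetical and std::uintptr_t stands in for quintptr.

```cpp
#include <cassert>
#include <cstdint>

// The masked form from the message.
inline bool isStaticMasked(std::uintptr_t d) { return (d & ~std::uintptr_t(1)) == 0; }
// The unsigned comparison that cmp #2 followed by b.lo corresponds to.
inline bool isStaticCompared(std::uintptr_t d) { return d < 2; }

// Checks that both forms agree on a few boundary and ordinary values.
inline void checkFormsAgree() {
    const std::uintptr_t samples[] = {0, 1, 2, 3, 4, 0x1000,
                                      UINTPTR_MAX - 1, UINTPTR_MAX};
    for (std::uintptr_t d : samples)
        assert(isStaticMasked(d) == isStaticCompared(d));
}
```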

All tests were performed using GCC 4.8, except for the AArch64 one, which used 
Clang 3.4. Comparing the ARMv7 code from GCC and from Clang, GCC generated much 
better code.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center



