> There might be a little catch with regards to structs containing
> members that are 8 bytes in size (aka quint64). Are those getting
> passed in registers on 32bit ARM?

struct struct_16Bytes {
     unsigned long long first, second;
};

unsigned long long test(struct_16Bytes s) {
     return s.second;
}

         mov     r0, r2 ; Each 8-byte member occupies two registers
         mov     r1, r3
         bx      lr

If I change the signature to "test(const struct_16Bytes &s)", then it compiles to:
         ldr     r2, [r0, #8]   ; ldrs are slow, since they read from memory
         ldr     r1, [r0, #12]
         mov     r0, r2
         bx      lr

(Tested on ARM 32bit, clang -target arm-none-linux-gnueabi)
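For anyone who wants to reproduce the comparison, here is a minimal self-contained sketch with both variants side by side. The names Pair64, second_by_value and second_by_ref are made up for illustration and are not from any real code base. Adding -O2 -S to the clang invocation above should print the assembly, though the exact instructions may differ between compiler versions.

```cpp
#include <cstdint>

// Hypothetical stand-in for struct_16Bytes: two 8-byte members.
struct Pair64 {
    std::uint64_t first, second;
};

// Sanity check on the size assumed in this thread (two 8-byte members, no padding).
static_assert(sizeof(Pair64) == 16, "expected a 16-byte struct");

// By value: the caller hands over a copy of the struct.
std::uint64_t second_by_value(Pair64 s) {
    return s.second;
}

// By const reference: the caller hands over an address, so the callee
// has to load the member through that address.
std::uint64_t second_by_ref(const Pair64 &s) {
    return s.second;
}
```

Both functions return the same value; only the calling convention, and therefore the generated code, differs.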

