[Development] QJsonPrivate::Parser::parseObject broken on big endian

Thu Sep 13 13:23:16 CEST 2012

On Sep 13, 2012, at 12:56 PM, ext Konstantin Tokarev <annulen at yandex.ru> wrote:

> 03.09.2012, 15:21, "Olivier Goffart" <olivier at woboq.com>:
> 
>>  On Sunday 02 September 2012 23:10:18 Konstantin Tokarev wrote:
>>>   Hi all,
>>> 
>>>   When building Qt 5 on big endian host (PPC) I've found moc breaking on Qt
>>>   classes containing Q_PLUGIN_METADATA with
>>> 
>>>   ASSERT: "idx >= 0 && idx < s" in file
>>>   ../../../include/QtCore/../../src/corelib/tools/qvarlengtharray.h, line 111
>>> 
>>>   It turned out to be a fault of QJsonPrivate::Parser::parseObject which has
>>>   different code for handling of big endian and little endian cases:
>>> 
>>>       if (parsedObject.offsets.size()) {
>>>           int tableSize = parsedObject.offsets.size()*sizeof(uint);
>>                                                       ^^^^^^^^^^^^^
>> 
>>  The error is there:  one should multiply tableSize by sizeof(uint) only if one
>>  do a memcpy.
>>>           table = reserveSpace(tableSize);
>>>   #if Q_BYTE_ORDER == Q_LITTLE_ENDIAN
>>>           memcpy(data + table, parsedObject.offsets.constData(), tableSize);
>>>   #else
>>>           offset *o = (offset *)(data + table);
>>>           for (int i = 0; i < tableSize; ++i)
>>>               o[i] = parsedObject.offsets[i];
>>> 
>>>   #endif
>>>       }
>>> 
>>>   Could anyone explain why memcpy cannot be used for big endian case here?
>>  I guess that's because the offsets needs to be stored in little endian.
>>  offset is a typedef to a class that has an assignement operator which swap the
>>  bytes.
> 
> It turned out to be not the only place where memcpy is working incorrectly.
> Unit test segfaults on memcpy in Value::copyData 
> 
> 404     case QJsonValue::Array:
> 405     case QJsonValue::Object: {
> 406         const QJsonPrivate::Base *b = v.base;
> 407         if (!b)
> 408             b = (v.t == QJsonValue::Array ? &emptyArray : &emptyObject);
> 409         memcpy(dest, b, b->size);
> 410         break;
> 411     }
> 
> I tried to understand why memcpy fails here and how to replace it for BE case,
> but still don't have a clue. I'm also quite disappointed to see so heavily endian-
> dependent code written nowadays.

That's what you have to do when doing binary formats unfortunately. I think the main problem is that we currently don't have a big endian machine at hand for testing. I've tried to make it as endian independent as possible already, but without being able to test you're bound to have bugs in such code.

As to why this would fail: The only reason I can see is if either dest is invalid, or b->size contains garbage.

Cheers,
Lars