[Development] API review request: CBOR Stream reader and writer

Thiago Macieira thiago.macieira at intel.com
Mon Jan 22 10:37:40 CET 2018


On Wednesday, 17 January 2018 13:25:53 PST Thiago Macieira wrote:
> My current idea is a command-line tool that converts between serialisation
> formats:
>  * CBOR
>  * CBOR diagnostic notation (output only, since I won't write the parser)
>  * JSON
>  * XML
>  * Qt binary JSON
>  * Plain QDataStream (output only, since it's not self-describing)
>  * QDataStream-serialised QVariant (is self-describing)
> 
> Though, because of the conversions, this example is ideal for QCborValue,
> not the stream reader and writer.

Done: https://codereview.qt-project.org/217410
I'd really appreciate it if someone reviewed my code that parses XML.

I've also used this example application to benchmark the various aspects of 
the encoder. It's helped me find a couple of bottlenecks in the 
implementation, which led to a redesign of the string parsing in 
QCborStreamReader.

The current numbers are for parsing an array of 5000 entries, each one a map
whose contents were obtained with:
qtplugininfo --full-json /usr/lib64/qt5/plugins/akregator_config_advanced.so

Binary JSON validating:
         97,003559      task-clock:u (msec)
       237.092.857      cycles
       437.005.872      instructions
[15.2% was spent in QIODevice::readAll, 59.1% in fromBinaryData]

JSON parsing:
        273,359723      task-clock:u (msec)
       793.297.513      cycles
     2.698.607.303      instructions
[4.7% in readAll(), 78.2% in fromJson]

CBOR parsing:
        341,311535      task-clock (msec)
       885.053.081      cycles
     2.548.803.851      instructions

The string parser still accounts for 70.5% of the full execution time, of
which 33.4% is in QCborStreamReader and 20.4% is in calls to isValidUtf8().
The program spends 25.0% inside QIODevice, called from the string decoder.
Unlike the JSON parser, we don't operate on a pre-read byte array but
directly on the QIODevice, checking for size.

The JSON parser spends 56.2% of the full execution time parsing strings.

As for the encoders, the test is done by reading from Binary JSON, converting 
to QVariant, then back from QVariant and then saving.

Binary JSON (baseline):
        724,619527      task-clock (msec)
     1.792.421.866      cycles
     2.983.986.222      instructions
Time spent in toBinaryData: 1.24%

JSON:
       1150,128441      task-clock:u (msec)
     3.179.240.094      cycles
     6.673.262.299      instructions
Time spent in the encoder: 34.5%, so ~403 ms

CBOR:
        930,697635      task-clock:u (msec)
     2.391.326.016      cycles
     4.910.714.973      instructions
Time spent in the encoder: 21.2%, so 176 ms

File sizes:
Binary JSON: 55,540,020 bytes (546 MB/s read, 5900 MB/s write)
JSON:        57,580,003 bytes (201 MB/s read,  136 MB/s write)
CBOR:        41,200,002 bytes (115 MB/s read,  223 MB/s write)
-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center






More information about the Development mailing list