[Development] API review request: CBOR Stream reader and writer
Thiago Macieira
thiago.macieira at intel.com
Mon Jan 22 10:37:40 CET 2018
On Wednesday, 17 January 2018 13:25:53 PST Thiago Macieira wrote:
> My current idea is a command-line tool that converts between serialisation
> formats:
> * CBOR
> * CBOR diagnostic notation (output only, since I won't write the parser)
> * JSON
> * XML
> * Qt binary JSON
> * Plain QDataStream (output only, since it's not self-describing)
> * QDataStream-serialised QVariant (is self-describing)
>
> Though, because of the conversions, this example is ideal for QCborValue,
> not the stream reader and writer.
Done: https://codereview.qt-project.org/217410
I'd really appreciate it if someone reviewed my code that parses XML.
I've also used this example application to benchmark the various aspects of
the encoder. It's helped me find a couple of bottlenecks in the
implementation, which led to a redesign of the string parsing in
QCborStreamReader.
The current numbers are for parsing an array of 5000 entries, each of which is
a map whose contents were obtained by running:
qtplugininfo --full-json /usr/lib64/qt5/plugins/akregator_config_advanced.so
Binary JSON validating:
97,003559 task-clock:u (msec)
237.092.857 cycles
437.005.872 instructions
[15.2% was spent in QIODevice::readAll, 59.1% in fromBinaryData]
JSON parsing:
273,359723 task-clock:u (msec)
793.297.513 cycles
2.698.607.303 instructions
[4.7% in readAll(), 78.2% in fromJson]
CBOR parsing:
341,311535 task-clock (msec)
885.053.081 cycles
2.548.803.851 instructions
The string parser still accounts for 70.5% of the full execution time, of
which 33.4% is in QCborStreamReader and 20.4% in calls to isValidUtf8(). The
program spends 25.0% inside QIODevice, called from the string decoder. Unlike
the JSON parser, we don't operate on a pre-read byte array, but directly on the
QIODevice, checking for size.
The JSON parser spends 56.2% of the full execution time parsing strings.
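To make the two costs named above concrete (the size check and the UTF-8 validation), here is a minimal sketch of decoding one definite-length CBOR text string. It is not how QCborStreamReader is implemented: it works on an in-memory buffer rather than a QIODevice, it ignores indefinite-length (chunked) strings, and the names isValidUtf8Sketch and decodeTextString are hypothetical, chosen only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Strict UTF-8 check: rejects truncated sequences, overlong encodings,
// surrogates and code points above U+10FFFF. Illustrative only.
inline bool isValidUtf8Sketch(std::string_view s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        const auto b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) { ++i; continue; }            // plain ASCII
        std::size_t extra;                           // continuation bytes expected
        std::uint32_t cp, min;                       // code point, smallest legal value
        if ((b0 & 0xE0) == 0xC0)      { extra = 1; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; min = 0x10000; }
        else return false;                           // stray continuation or bad lead byte
        if (s.size() - i <= extra) return false;     // sequence runs past the end
        for (std::size_t k = 1; k <= extra; ++k) {
            const auto b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += extra + 1;
    }
    return true;
}

// Decodes one definite-length CBOR text string (major type 3) at the start
// of `in`. Returns std::nullopt on truncated input, on any other item type,
// or on invalid UTF-8.
inline std::optional<std::string> decodeTextString(std::string_view in)
{
    if (in.empty()) return std::nullopt;
    const auto initial = static_cast<unsigned char>(in[0]);
    if ((initial >> 5) != 3) return std::nullopt;    // not a text string
    const unsigned info = initial & 0x1F;
    std::size_t headerLen = 1;
    std::uint64_t len = info;                        // 0..23: length stored inline
    if (info >= 24) {
        if (info > 27) return std::nullopt;          // reserved or indefinite length
        const std::size_t lenBytes = std::size_t(1) << (info - 24); // 1, 2, 4 or 8
        if (in.size() - 1 < lenBytes) return std::nullopt;
        len = 0;
        for (std::size_t k = 0; k < lenBytes; ++k)   // big-endian length field
            len = (len << 8) | static_cast<unsigned char>(in[1 + k]);
        headerLen += lenBytes;
    }
    // Size check: the declared length must fit in the data actually available.
    if (len > in.size() - headerLen) return std::nullopt;
    const std::string_view payload = in.substr(headerLen, static_cast<std::size_t>(len));
    if (!isValidUtf8Sketch(payload)) return std::nullopt;
    return std::string(payload);
}
```

Every string pays for both steps: the length has to be checked against what the source can deliver before any copy, and each payload byte is visited once more for validation. A reader that pulls from a device has to make the first check against the device instead of a buffer size it already knows.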
As for the encoders, the test reads from Binary JSON, converts to QVariant,
converts back from QVariant into the target format and then saves.
Binary JSON (baseline):
724,619527 task-clock (msec)
1.792.421.866 cycles
2.983.986.222 instructions
Time spent in toBinaryData: 1.24%
JSON:
1150,128441 task-clock:u (msec)
3.179.240.094 cycles
6.673.262.299 instructions
Time spent in the encoder: 34.5%, so ~403 ms
CBOR:
930,697635 task-clock:u (msec)
2.391.326.016 cycles
4.910.714.973 instructions
Time spent in the encoder: 21.2%, so 176 ms
File sizes:
Binary JSON: 55,540,020 bytes (546 MB/s read, 5900 MB/s write)
JSON: 57,580,003 bytes (201 MB/s read, 136 MB/s write)
CBOR: 41,200,002 bytes (115 MB/s read, 223 MB/s write)
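For anyone who wants to re-derive the rates, the sketch below divides each file size by the corresponding time from this message. It assumes that "MB" here means 1024 * 1024 bytes, which is an inference from the numbers rather than something stated above; the function names are hypothetical. Read rates use the full task-clock of the parsing run, write rates use the time attributed to the encoder alone.

```cpp
#include <cassert>
#include <cmath>

// Throughput in MiB/s, taking 1 MiB = 1024 * 1024 bytes (an assumption
// that happens to reproduce the figures quoted in this message).
inline double mibPerSecond(double bytes, double milliseconds)
{
    return (bytes / (1024.0 * 1024.0)) / (milliseconds / 1000.0);
}

// Re-derives the quoted read and write rates from the sizes and timings above.
inline void checkQuotedRates()
{
    // Read: file size over the whole task-clock of the parsing run.
    assert(std::lround(mibPerSecond(55540020, 97.003559)) == 546);   // Binary JSON
    assert(std::lround(mibPerSecond(57580003, 273.359723)) == 201);  // JSON
    assert(std::lround(mibPerSecond(41200002, 341.311535)) == 115);  // CBOR

    // Write: file size over the time spent in the encoder only.
    assert(std::lround(mibPerSecond(57580003, 403.0)) == 136);       // JSON
    assert(std::lround(mibPerSecond(41200002, 176.0)) == 223);       // CBOR
}
```

The Binary JSON write rate works the same way: 1.24% of 724,619527 msec is roughly 9 ms, which for 55,540,020 bytes comes out close to the 5900 quoted above.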
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center