[Development] Proposal for an efficient and robust (de)serialization mechanism working with Qt-supported data

Arnaud Clere arnaud.clere at minmaxmedical.com
Fri Aug 30 17:12:03 CEST 2019


Hi Simon,

I actually plan to attend QtCS and start a discussion on this, as the topic cuts across many areas.
In the meantime, to move forward, I will submit a change to codereview with just the core of QBind, focusing on how to offer more flexibility and efficiency when writing C++ data to Json and Cbor.

The question you are raising when talking about schemas is: "where do I look for information about the structure of the data?" (typically when reading...)
- For QDataStream, the answer is: "into the code"... but which version of "the code"? QDataStream::version() evolves at Qt's pace, so as a Qt user, your own code may not be enough to tell you how to read your own data from one machine to another (see the QDataStream sketch just after this list)
- For protobuf and the like: "into an external schema file", which ensures you cannot misinterpret the binary data and allows gradual changes to the schema... but you have to interpret the schema using external libraries or, in the case of protobuf, compile the schema into executable code before you can start reading data
- For Xml: some of the structure can be guessed from the markup, but it is more "open to interpretation" and you will probably need an external schema or additional information to make sense of the data
- For Json, Cbor: most of the structure is made explicit by the standard, so even a third party may be able to do a lot with your data, although its semantics may remain obscure if maps/objects do not use meaningful names
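To make the QDataStream point concrete: the shared knowledge lives only in the code, typically by pinning the stream version on both sides with the real QDataStream::setVersion() API (the surrounding function names here are just illustrative):

  #include <QDataStream>
  #include <QFile>
  #include <QList>
  #include <QPointF>

  // Nothing in the data itself says which format version was used;
  // writer and reader must agree on it "in the code".
  void writePoints(QFile &file, const QList<QPointF> &points)
  {
      QDataStream out(&file);
      out.setVersion(QDataStream::Qt_5_12); // pinned by convention
      out << points;
  }

  QList<QPointF> readPoints(QFile &file)
  {
      QDataStream in(&file);
      in.setVersion(QDataStream::Qt_5_12);  // must match the writer
      QList<QPointF> points;
      in >> points;
      return points;
  }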

As a result, it is usually easier (and safer!) to deal with explicitly structured data such as Json and Cbor, and that is what I would generally recommend.
Even though QBind cannot beat QDataStream/protobuf on raw write performance, it can come very close to them when writing Cbor.

Compared to QDataStream::operator<< and >>, QBind requires the code to be more descriptive about the data behind a C++ type (is it a sequence of items or a record of named items?).
It can then translate that description into explicitly structured data formats like Json, Cbor, Xml, Tables, etc., or just ignore it in implicitly structured data formats like QDataStream.
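For illustration, here is a minimal sketch of that idea. The names below (JsonWriter, record(), item(), bind()) are hypothetical, not the actual QBind API; they only show how a fluent interface can carry the structure description that QDataStream leaves implicit:

  #include <QString>
  #include <QTextStream>

  // A toy writer for one format; a real design would provide one such
  // writer per format (Json, Cbor, ...) behind the same fluent interface.
  class JsonWriter {
  public:
      explicit JsonWriter(QTextStream &out) : out(out) {}
      JsonWriter &record()               { out << '{'; first = true; return *this; }
      JsonWriter &item(const char *name) {
          if (!first) out << ',';
          first = false;
          out << '"' << name << "\":";
          return *this;
      }
      JsonWriter &bind(const QString &s) { out << '"' << s << '"'; return *this; } // no escaping, toy code
      JsonWriter &bind(int i)            { out << i; return *this; }
      JsonWriter &end()                  { out << '}'; return *this; }
  private:
      QTextStream &out;
      bool first = true;
  };

  struct Person { QString name; int age; };

  // The user code describes the structure once: a record of named items.
  // A CborWriter exposing the same interface could reuse this unchanged.
  JsonWriter &bind(JsonWriter &w, const Person &p)
  {
      return w.record()
              .item("name").bind(p.name)
              .item("age").bind(p.age)
              .end();
  }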

Compared to protobuf, QBind replaces external schema files plus tooling with a C++ fluent interface that is akin to an embedded DSL, albeit a very trivial one.
So it does not need external tools (but does not help with reading the data in other languages). And it allows choosing the data format best suited to the requirements: binary for performance, an explicit text format for interoperability with other languages, Xml for its toolset, and so on.
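Continuing the hypothetical sketch above, the choice of format then happens at the call site, where the requirements are known:

  QString text;
  QTextStream stream(&text);
  JsonWriter json(stream);          // or a CborWriter, for performance
  bind(json, Person{QStringLiteral("Ada"), 36});
  // text is now: {"name":"Ada","age":36}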

Hope this helps in thinking about the various pros/cons involved...

Cheers,
Arnaud

From: Simon Hausmann <Simon.Hausmann at qt.io>
Sent: Friday, 30 August 2019 15:43
To: Arnaud Clere <arnaud.clere at minmaxmedical.com>; Edward Welbourne <edward.welbourne at qt.io>
Cc: development at qt-project.org
Subject: Re: Proposal for an efficient and robust (de)serialization mechanism working with Qt-supported data

Hi Arnaud,

I think that perhaps this is also a topic worth discussing at the Qt Contributor Summit, if you can attend. In a face-to-face discussion we may be able to reach a common understanding of what exactly we need in Qt more efficiently.

I'm interested in discussing how we can get away from QDataStream towards something more schema-oriented. Perhaps that involves an intermediate abstraction similar to what you're proposing.

Simon
