[Development] QDataStream: blackbox or document all versions?

Mon Sep 26 12:29:07 CEST 2016

On Sep 26, 2016, at 11:34, Simon Hausmann <Simon.Hausmann at qt.io> wrote:

> Hi,
>
> I'm very much in favor of using a proper schema based system such as protocol buffers if we decide
> to remove the black box from serialization. They don't appear to be connected, but the moment you
> need to deal with changes in the format, the protobuf approach wins IMO. The other advantage is interoperability
> with the world outside of Qt.

There are many alternatives to protocol buffers, some of them faster and/or more compact. https://capnproto.org/ claims to be a really good one, for example. It has an MIT license. Zero-copy is a nice feature to have. From what I’ve read, I’d probably never choose protocol buffers if I could avoid it: you have to use the code generator, and it’s not efficient. But I’ve never tried it, either.

Being able to mmap the file and immediately treat it as the data structure that you really wanted is a nice feature to have. That kind of implementation is not necessarily the same thing as a serialization protocol, although the ideal serialization protocol could be designed so that you can deal with it either way. It doesn’t fit the QDataStream API, anyway.
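To make the mmap idea concrete, here is a minimal sketch in Python, assuming a made-up fixed-size record layout (a uint32 id, 4 padding bytes, and a float64 value). The names RECORD, write_records and read_record are hypothetical, and this is not how Cap'n Proto or any other real format lays out its data; it only shows that a reader can jump straight to one record without parsing the rest of the file.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: uint32 id, 4 padding bytes, float64 value.
# The padding keeps the double 8-byte aligned, at the cost of 4 wasted bytes.
RECORD = struct.Struct("<I4xd")

def write_records(path, records):
    """Write (id, value) pairs back to back, with no header and no tags."""
    with open(path, "wb") as f:
        for rec_id, value in records:
            f.write(RECORD.pack(rec_id, value))

def read_record(path, index):
    """Map the file and decode only record `index`; nothing else is parsed."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return RECORD.unpack_from(mm, index * RECORD.size)

def demo():
    """Write three records to a temporary file and read back the last one."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_records(path, [(1, 0.5), (2, 1.5), (3, 2.5)])
        return read_record(path, 2), RECORD.size
    finally:
        os.remove(path)
```

Each record takes 16 bytes here rather than the 12 that the two fields strictly need, which is the price of keeping the double aligned.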

Wikipedia has a comparison:  https://en.wikipedia.org/wiki/Comparison_of_data_serialization_formats

There are even several that claim to be both binary and human-readable. I always thought that a binary format ought to be better than things like XML and JSON if it doesn’t require a schema, uses tags like XML, can represent the common data types natively (various types of numbers, bools, enums, opaque byte arrays, and strings), and comes with a tool to format it as human-readable text. An early example is WBXML, WAP Binary XML (but it doesn’t have any representation for floating-point numbers AFAICT). I also wrote one which uses a string table for the tags, and a binary representation for the tree structure and the data. But then I realized maybe I should have designed it for mmapping rather than only for bytewise streaming. On the other hand, aligning the data wastes some space.
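As a rough illustration of that kind of format (schema-less, tagged, with native types and a string table for the tag names), here is a small Python sketch. Everything in it is an assumption made for the example: the type codes, the field widths, and the function names encode, decode and dump. It is not the format I wrote, nor WBXML, and it leaves out enums and lists for brevity.

```python
import struct

def _collect_tags(value, table):
    """Gather every distinct dict key, in first-seen order."""
    if isinstance(value, dict):
        for key, child in value.items():
            if key not in table:
                table.append(key)
            _collect_tags(child, table)

def _encode_node(value, table, out):
    # bool must be tested before int, since bool is a subclass of int.
    if isinstance(value, bool):
        out += b"b" + struct.pack("<?", value)
    elif isinstance(value, int):
        out += b"i" + struct.pack("<q", value)
    elif isinstance(value, float):
        out += b"f" + struct.pack("<d", value)
    elif isinstance(value, str):
        raw = value.encode("utf-8")
        out += b"s" + struct.pack("<I", len(raw)) + raw
    elif isinstance(value, bytes):
        out += b"x" + struct.pack("<I", len(value)) + value
    elif isinstance(value, dict):
        out += b"D" + struct.pack("<H", len(value))
        for key, child in value.items():
            # Tags are stored as indexes into the string table, not as text.
            out += struct.pack("<H", table.index(key))
            _encode_node(child, table, out)
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")

def encode(doc):
    """Serialize a (possibly nested) dict: string table first, then the tree."""
    table = []
    _collect_tags(doc, table)
    out = bytearray(struct.pack("<H", len(table)))
    for tag in table:
        raw = tag.encode("utf-8")
        out += struct.pack("<H", len(raw)) + raw
    _encode_node(doc, table, out)
    return bytes(out)

def _decode_node(buf, pos, table):
    kind = buf[pos:pos + 1]
    pos += 1
    if kind == b"b":
        return struct.unpack_from("<?", buf, pos)[0], pos + 1
    if kind == b"i":
        return struct.unpack_from("<q", buf, pos)[0], pos + 8
    if kind == b"f":
        return struct.unpack_from("<d", buf, pos)[0], pos + 8
    if kind in (b"s", b"x"):
        (n,) = struct.unpack_from("<I", buf, pos)
        raw = buf[pos + 4:pos + 4 + n]
        return (raw.decode("utf-8") if kind == b"s" else raw), pos + 4 + n
    if kind == b"D":
        (count,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        result = {}
        for _ in range(count):
            (idx,) = struct.unpack_from("<H", buf, pos)
            value, pos = _decode_node(buf, pos + 2, table)
            result[table[idx]] = value
        return result, pos
    raise ValueError(f"unknown type code {kind!r}")

def decode(buf):
    """Rebuild the dict without any schema: the stream describes itself."""
    (count,) = struct.unpack_from("<H", buf, 0)
    pos, table = 2, []
    for _ in range(count):
        (n,) = struct.unpack_from("<H", buf, pos)
        table.append(buf[pos + 2:pos + 2 + n].decode("utf-8"))
        pos += 2 + n
    doc, _ = _decode_node(buf, pos, table)
    return doc

def dump(value, indent=0):
    """Render a decoded document as indented text for human inspection."""
    lines = []
    for key, child in value.items():
        if isinstance(child, dict):
            lines.append(" " * indent + f"{key}:")
            lines.append(dump(child, indent + 2))
        else:
            lines.append(" " * indent + f"{key}: {child!r}")
    return "\n".join(lines)
```

Calling dump(decode(encode(doc))) plays the role of the formatting tool: the bytes on disk stay binary, and the readable form is generated only when someone wants to look at it. Note that nothing here is aligned, so this sketch is meant for bytewise streaming, not for mmapping.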

If you like schemas or IDL, there’s the DDS serialization protocol. We’ve already tossed around the idea of having a Qt DDS wrapper because it would be useful in certain known industries. But it requires more structure and discipline compared to tag-based formats. It resembles CORBA because it comes from the OMG. Like CORBA, it’s not worth the effort for rapid prototyping, only for larger-scale projects where the need for robustness outweighs the need for a short development cycle.

QDataStream is just a sequence: you have to know what to expect when you are deserializing, rather than checking tag names or using a schema, unless you adopt a convention that the stream will consist of alternating tags and values.
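The difference can be sketched with Python's struct module standing in for the stream. This is only an illustration of the "plain sequence" idea, not QDataStream's actual wire format; the function names are hypothetical, and the only detail borrowed from QDataStream is its default big-endian byte order.

```python
import struct

def write_size(width, height):
    """Serialize two int32 values back to back, with nothing to label them."""
    return struct.pack(">ii", width, height)

def read_size(buf):
    """Correct reader: assumes the writer's order (width, then height)."""
    return struct.unpack(">ii", buf)

def read_size_wrong(buf):
    """Mistaken reader: assumes (height, width). The stream cannot object."""
    height, width = struct.unpack(">ii", buf)
    return width, height

def write_tagged(fields):
    """Convention on top of the same stream: alternate tag, value, tag, value."""
    out = bytearray()
    for tag, value in fields.items():
        raw = tag.encode("utf-8")
        out += struct.pack(">I", len(raw)) + raw + struct.pack(">i", value)
    return bytes(out)

def read_tagged(buf):
    """Read tag/value pairs until the buffer ends; order no longer matters."""
    fields, pos = {}, 0
    while pos < len(buf):
        (n,) = struct.unpack_from(">I", buf, pos)
        tag = buf[pos + 4:pos + 4 + n].decode("utf-8")
        (value,) = struct.unpack_from(">i", buf, pos + 4 + n)
        fields[tag] = value
        pos += 4 + n + 4
    return fields
```

With the untagged pair, a reader that gets the order wrong silently swaps width and height. With the tag/value convention the writer can emit fields in any order and the reader still ends up with the right mapping, at the cost of repeating the tag text in every record.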

Is it actually that useful?  Maybe we should deprecate it and come up with something better?
