[Development] FW: QtCS19 Serialization session

Arnaud Clère arnaud.clere at moulinarso.fr
Tue Nov 26 15:50:28 CET 2019


Now, this is the discussion I was hoping for at QtCS :-)

> -----Original Message-----
> From: Thiago Macieira <thiago.macieira at intel.com>
>
> > - it is frequent to have to save/load our types and then send/receive
> > them to some server/device using, say QSettings/QJsonDocument for the
> > first and JSON/CBOR for the latter
>
> How many do so interchangeably. I'm not asking about whether we should
> support different serialisations. I am asking how many types would benefit
> from a single, standardised mechanism for multiple backends?

The details of the serialization are usually far less important than being
able to comply with the requirements of the environment:
- my company has developed a comprehensive XML toolset: any XML will do
  (possibly with XSLT in the middle)
- my device needs to exchange data: I want the CBOR format but I am free to
  define whatever payload suits my use case
- my desktop application has to load/save user data: any QSettings flavor
  will do

> For example, a Rectangle in CBOR might be stored as (x1,y1,x2,y2) whereas the
> QSettings format might have been XxY+W+H, which means a generic solution
> will not work.

Truly, this kind of question can become a standardization nightmare. But as
with all standardization efforts, the value is not so much in the final
solution chosen as in providing users with a common, well, "standard". Your
example shows that Qt evolved in different directions with respect to
serialization. Let us improve this in the future!

"@QPoint(10,20,100,150)" is a necessary facility to handle legacy data, but
a more explicit and principled API and data format would be better.

Apart from a few basic types (int...) that should use the format mandated by
the target data format, or another standardized textual representation when
there is none (say, QUuid::toRfc4122 for JSON), I argue that all data types
should be serialized as a "record/object/struct" with meaningful property
names.

Indeed, the QML types already look like a very nice and useful de facto
standard for exchanging data on top of JSON/CBOR:

https://doc.qt.io/qt-5/qmltypes.html
https://doc.qt.io/qt-5/qml-rect.html

With this approach, there is no need for wasteful bikeshedding:

QRect::zap(...) {
  return value.record("rect")
    .item("x", x1)
    .item("y", y1)
    .item("width", x2-x1)
    .item("height", y2-y1); // TODO use setHeight() if (value->mode()==Read)
}
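
To make the named-property idea concrete, here is a minimal self-contained
sketch in plain C++. RecordWriter, Rect and toJson are hypothetical names
chosen for illustration; they are not the proposed API nor any Qt class, and
the width/height arithmetic simply follows the snippet above.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a JSON-like writer; not the proposed Qt API.
class RecordWriter {
public:
    explicit RecordWriter(std::ostringstream &out) : out_(out) { out_ << '{'; }

    // Writes one named integer property and returns *this so calls can chain.
    RecordWriter &item(const std::string &name, int value) {
        if (!first_) out_ << ',';
        first_ = false;
        out_ << '"' << name << "\":" << value;
        return *this;
    }

    // Closes the record.
    void end() { out_ << '}'; }

private:
    std::ostringstream &out_;
    bool first_ = true;
};

// Hypothetical rectangle stored as two corners, as in the snippet above.
struct Rect {
    int x1, y1, x2, y2;
};

// Serializes with the x/y/width/height property names of the QML rect type.
std::string toJson(const Rect &r) {
    std::ostringstream out;
    RecordWriter(out)
        .item("x", r.x1)
        .item("y", r.y1)
        .item("width", r.x2 - r.x1)
        .item("height", r.y2 - r.y1)
        .end();
    return out.str();
}
```

Whatever the internal representation is (two corners here), the data on the
wire only exposes the agreed property names.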

> > There are not much alternatives
> > in how to serialize QList and the ones in CBOR are covered
> > (definite/indefinite). QPoint::zap or, say, QColor::zap may not fit
> > all use cases but the API allows to easily define another serialization.
>
> I do not agree. Please back up that statement with data: how many types would benefit from a standardised serialisation?

What kind of data? I think all of the QML types above would benefit from a
standard way to serialize to/from other data formats. Are you asking which
ones would work for both reading and writing using the fluent interface only?

> And why are you including item models in this, what do they have to do with serialisation?

The proposed API treats serialization as simply iterating over structured
data with some Reader/Writer. The proof of concept demonstrates that a
standard API for this would turn many use cases into a simple one-liner like:

// Fill the table view with the CSV file content
QCsvReader(&csvFile).read(QItemModelWriter<>(&tableview).value());

That is the kind of simplicity, or absence thereof, that makes people move
away from C++ to Python (cf. list comprehensions).
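
As a rough illustration of a Reader driving a Writer, here is a sketch in
plain C++ with no Qt dependency. CsvReader and TableWriter are hypothetical
toy classes (the CSV parsing ignores quoting and escaping); they only mimic
the shape of the one-liner above and are not the proof-of-concept code.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Table = std::vector<std::vector<std::string>>;

// Hypothetical writer: receives structured data as rows of cells.
class TableWriter {
public:
    explicit TableWriter(Table &table) : table_(table) {}
    void beginRow() { table_.emplace_back(); }
    void cell(const std::string &text) {
        assert(!table_.empty()); // beginRow() must be called first
        table_.back().push_back(text);
    }

private:
    Table &table_;
};

// Hypothetical reader: splits on newlines and commas, no quoting support.
class CsvReader {
public:
    explicit CsvReader(std::string text) : text_(std::move(text)) {}

    // Iterates over the CSV content and pushes every cell into the writer.
    void read(TableWriter writer) const {
        std::istringstream lines(text_);
        std::string line;
        while (std::getline(lines, line)) {
            writer.beginRow();
            std::istringstream cells(line);
            std::string cell;
            while (std::getline(cells, cell, ','))
                writer.cell(cell);
        }
    }

private:
    std::string text_;
};
```

With these toy classes, filling a table is also a one-liner:
CsvReader("a,b\n1,2\n").read(TableWriter(table));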

> > hey, I used to understand assembly and now it looks strange to me!
>
> Ok, that's actually a good point: assembly is very different from Qt-style
> C++, but it's not usually mixed with C++. What you're proposing is a very,
> very different API style that matches nothing in Qt and is meant to be used
> alongside Qt-style C++.
>
> There is no example of fluent-like API in Qt. The closest we'll see of it in
> C++ is the ranges API with operator|, which are meant to imitate a shell
> pipeline anyway.

C++ has evolved from inline assembly to ranges. I do not think Qt should stop
in the middle. The QDataStream streaming interface is a fluent interface
hidden behind a cryptic operator. Yet most people find it convenient, and
some are even happy not to have to understand the trick of passing along the
QDataStream&.
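
The "trick" can be shown without Qt. In the sketch below, TinyStream is a
hypothetical class that merely logs what is written to it; the point is that
a chained operator<< and a chained named method are the same pattern, since
both return a reference to the stream.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical minimal stream; it only records what was written to it.
class TinyStream {
public:
    // Named method: returns *this so that calls can be chained.
    TinyStream &write(int v) {
        log_.push_back(std::to_string(v));
        return *this;
    }
    const std::vector<std::string> &log() const { return log_; }

private:
    std::vector<std::string> log_;
};

// Operator form: forwards to write() and passes the stream along.
TinyStream &operator<<(TinyStream &s, int v) { return s.write(v); }

// The same three writes, once with the operator and once with named calls.
std::vector<std::string> viaOperator() {
    TinyStream s;
    s << 1 << 2 << 3;
    return s.log();
}

std::vector<std::string> viaMethods() {
    TinyStream s;
    s.write(1).write(2).write(3);
    return s.log();
}
```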

> As I said, I don't like it and I fear a very different API could be confusing.
> I understand what it's meant to do. I am saying I don't like that it does all it does.

Ok, but then how can we make it easier to read some CBOR into C++ types?
Build a QCborValue in memory, and then iterate over it to copy the data to
our own C++ types? That will always be possible, but my API would require
much less work and be safer and far more efficient.

> It's also advising people to write inefficient code. For example,
> deserialising a record requires repeated iterations and is either
> O(n log n) for sorted records (like JSON) or O(n²) for unsorted ones like
> CBOR. In
>   value.record()
>     .item("a", a)
>     .item("b", b)
>     .item("c", c);
>
> The .item() functions need to scan the QCborMap or QJsonObject container for
> the values being searched, which is O(n) and O(log n) respectively. A more
> efficient implementation would iterate over the keys, which is O(1).

Actually they do not. Implementations can handle out-of-order record items
very efficiently, as demonstrated in the proof of concept: it only requires
caching the QJsonValue or pos() corresponding to unexpected record items.
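
The caching idea can be sketched as follows, in plain C++. RecordReader is a
hypothetical class reading from an in-memory list of key/value pairs; it is
not the proof-of-concept code, which caches a QJsonValue or a stream position
rather than an int. When items arrive in the expected order, each item() call
consumes exactly one entry and nothing is cached.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class RecordReader {
public:
    using Entry = std::pair<std::string, int>;

    explicit RecordReader(std::vector<Entry> entries)
        : entries_(std::move(entries)) {}

    RecordReader &item(const std::string &name, int &out) {
        // 1. An earlier scan may already have skipped over this key.
        auto cached = cache_.find(name);
        if (cached != cache_.end()) {
            out = cached->second;
            cache_.erase(cached);
            return *this;
        }
        // 2. Otherwise read forward, caching every key that was not asked for.
        while (pos_ < entries_.size()) {
            const Entry &e = entries_[pos_++];
            if (e.first == name) {
                out = e.second;
                return *this;
            }
            cache_.emplace(e.first, e.second);
        }
        ok_ = false; // the key is absent from the record
        return *this;
    }

    bool ok() const { return ok_; }
    std::size_t consumed() const { return pos_; } // entries read so far
    std::size_t cached() const { return cache_.size(); }

private:
    std::vector<Entry> entries_;
    std::size_t pos_ = 0;
    std::unordered_map<std::string, int> cache_;
    bool ok_ = true;
};
```

Each entry is read from the input at most once, so a whole record costs one
forward pass plus hash lookups for the items that arrived out of order.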

Also, record is designed to handle struct/object, not large associative
containers, which should be serialized this way:

value.meta(qmColumns,"key,value").bind(map.iterator())

The benchmark shows my proposal is the most efficient solution for providing
flexible serialization to an extensible set of C++ types, apart from external
code generators [1].

Arnaud

[1] The option of making the Read/Write mode() a type trait, as in Boost, was
tested; it did not bring a noticeable performance gain while requiring zap()
to become a templated function.
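
For readers unfamiliar with the trade-off in [1], here is a minimal sketch of
the runtime-mode variant in plain C++. Mode, Value and Point are hypothetical
illustration types backed by a std::map, not the proposed API. Because the
direction is a runtime property of the value, zap() stays an ordinary
non-template member function.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class Mode { Read, Write };

// Hypothetical value bound to a simple name -> int store.
class Value {
public:
    Value(Mode mode, std::map<std::string, int> &store)
        : mode_(mode), store_(store) {}

    Mode mode() const { return mode_; }

    // Copies the field into the store when writing, or out of it when reading.
    Value &item(const std::string &name, int &field) {
        if (mode_ == Mode::Write) {
            store_[name] = field;
        } else {
            auto it = store_.find(name);
            if (it != store_.end()) field = it->second;
        }
        return *this;
    }

private:
    Mode mode_;
    std::map<std::string, int> &store_;
};

struct Point {
    int x = 0;
    int y = 0;
    // One ordinary (non-template) function serves both directions.
    void zap(Value &value) { value.item("x", x).item("y", y); }
};
```

With the mode as a type trait instead, zap() would have to be a template
parameterized on the reader/writer type.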

