[Development] Oslo, we have a problem</apollo 13> [char8_t]

Mon Jul 8 11:34:31 CEST 2019

Arnaud Clere (8 July 2019 09:46) wrote
> Instead of asking users to choose the correct QByteArray methods
> depending on the data it contains, why not let them say explicitly
> what it contains?
>
> //! Explicitly UTF-8 encoded byte array
> class QUtf8String : public QByteArray
> {
> public:
>     using QByteArray::QByteArray;
>     QUtf8String(const QByteArray &o) : QByteArray(o) {}
>     QUtf8String() : QByteArray() {}
> };
>
> Such a QUtf8String can be used everywhere a QByteArray is.
> Qt implementors can fix QByteArray toUpper(), split(), etc. without
> having to guess what to do.  Users only have to specify where they use
> UTF-8 to make sure they use the correct functions.  Having a COW
> QUtf8String providing a migration path from ambiguous QByteArray seems
> in line with the addition of char8_t and u8* to the C++ standard.

We could then move the string-ish methods to these derived string
classes, deprecating the QByteArray base versions (and not making them
virtual) in favour of using the right 8-bit string type.  (The base's
implementation could indeed use that of a derived string class.)
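
To make that concrete, here is a rough, Qt-free sketch of the shape I have
in mind.  ByteArray and Utf8String are hypothetical stand-ins for
QByteArray and the proposed QUtf8String, std::string stands in for the
COW storage, and which derived class the base forwards to is an
arbitrary choice made for illustration:

```cpp
// Qt-free sketch. ByteArray and Utf8String are hypothetical stand-ins for
// QByteArray and the proposed QUtf8String; neither is real Qt API.
#include <cassert>
#include <string>
#include <utility>

class ByteArray {
public:
    ByteArray() = default;
    ByteArray(std::string bytes) : m_bytes(std::move(bytes)) {}
    const std::string &bytes() const { return m_bytes; }

    // String-ish method left on the base only as a migration aid:
    // deprecated, deliberately not virtual, defined below.
    [[deprecated("say what the bytes contain: use Utf8String::toUpper()")]]
    ByteArray toUpper() const;

protected:
    std::string m_bytes;
};

class Utf8String : public ByteArray {
public:
    using ByteArray::ByteArray;

    // Simplification: only ASCII letters are mapped. Bytes >= 0x80 belong to
    // multi-byte UTF-8 sequences and are left alone; full Unicode case
    // mapping would need a character database.
    Utf8String toUpper() const {
        std::string out = m_bytes;
        for (char &c : out) {
            if (c >= 'a' && c <= 'z')
                c = static_cast<char>(c - 'a' + 'A');
        }
        return Utf8String(std::move(out));
    }
};

// The base's implementation reuses a derived class's. Forwarding to
// Utf8String is an arbitrary choice made for this sketch.
inline ByteArray ByteArray::toUpper() const {
    return Utf8String(m_bytes).toUpper();
}

inline void usageExample() {
    const Utf8String greeting("h\xC3\xA9llo");  // "héllo" encoded as UTF-8
    assert(greeting.toUpper().bytes() == "H\xC3\xA9LLO");
}
```

Code that still calls toUpper() on a plain ByteArray keeps compiling but
gets a deprecation warning pointing it at the typed class.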

That would undermine the "use anywhere a QByteArray is asked for"
property, of course: code that takes a QByteArray would get the base's
version of these methods, not the derived class's.  But anything that
accepts a QByteArray and does string-ish things to it is flawed anyway.
It should be deprecated in favour of a templated API taking a
string-type.
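
A small, Qt-free sketch of that difference, under stated assumptions:
ByteArray and Latin1String are hypothetical stand-ins (not Qt API), the
base's toUpper() is assumed to map ASCII letters only, and shoutViaBase()
and shout() are made-up names for the flawed and the templated API:

```cpp
// Qt-free sketch. ByteArray, Latin1String, shoutViaBase() and shout() are
// hypothetical names used for illustration only.
#include <cassert>
#include <string>
#include <utility>

class ByteArray {
public:
    ByteArray() = default;
    ByteArray(std::string bytes) : m_bytes(std::move(bytes)) {}
    const std::string &bytes() const { return m_bytes; }

    // Base version, not virtual. Assumed here to map ASCII letters only.
    ByteArray toUpper() const {
        std::string out = m_bytes;
        for (char &c : out) {
            if (c >= 'a' && c <= 'z')
                c = static_cast<char>(c - 'a' + 'A');
        }
        return ByteArray(std::move(out));
    }

protected:
    std::string m_bytes;
};

class Latin1String : public ByteArray {
public:
    using ByteArray::ByteArray;

    // Hides (does not override) the base method. Also maps the Latin-1
    // lower-case letters 0xE0..0xFE, except the division sign 0xF7.
    Latin1String toUpper() const {
        std::string out = m_bytes;
        for (char &c : out) {
            const unsigned char u = static_cast<unsigned char>(c);
            const bool ascii = u >= 'a' && u <= 'z';
            const bool accented = u >= 0xE0 && u <= 0xFE && u != 0xF7;
            if (ascii || accented)
                c = static_cast<char>(u - 0x20);
        }
        return Latin1String(std::move(out));
    }
};

// Flawed API: the static type is ByteArray, so the base's toUpper() runs
// even when the caller passes a Latin1String.
inline std::string shoutViaBase(const ByteArray &text) {
    return text.toUpper().bytes();
}

// Templated API: the caller's string type picks the toUpper() that runs.
template <typename String>
std::string shout(const String &text) {
    return text.toUpper().bytes();
}

inline void usageExample() {
    const Latin1String summer("\xE9t\xE9");          // "été" in Latin-1
    assert(shoutViaBase(summer) == "\xE9T\xE9");     // accents left alone
    assert(shout(summer) == "\xC9T\xC9");            // accents upper-cased
}
```

The same bytes come out differently depending on which API they pass
through, which is the argument for deprecating the QByteArray-taking one.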

	Eddy.