[Development] Oslo, we have a problem</apollo 13> [char8_t]
thiago.macieira at intel.com
Mon Jul 8 19:24:58 CEST 2019
On Monday, 8 July 2019 12:42:51 -03 Arnaud Clere wrote:
> -----Original Message-----
> From: Thiago Macieira <thiago.macieira at intel.com>
> > I am not completely convinced of the benefit of adding an owning UTF-8
> > string class, though I very much agree with a view over UTF-8 strings.
> > The reason is not the string class itself (alone it is definitely
> > useful), but the fact that it would muddy the waters as to what string
> > classes one should use in API. We might end up with some API using UTF-8
> > and some UTF-16.
> Indeed, this is already the case: QJsonDocument::toJson() returns a
Which is the expected behaviour, as it returns something suitable for transfer
over a socket, pipe to a process or to be saved in a file, like
QCborValue::toCbor(), QDataStream, QTextStream (operating over a QByteArray),
QXmlStreamWriter (operating over a QByteArray), QDomDocument::toByteArray(),
It's just that, unlike those others, it is also a UTF-8 encoded text string.
The XML ones, for example, can be configured to write under other encodings
and such information is stored in the XML header. CBOR and QDataStream are
> on which users can conveniently call toUpper() until some data
> from the field makes them understand it does not work...
And there's little we can do to prevent that. Even if we removed
QByteArray::toUpper and left it only in QLatin1String, people would still find
ways to uppercase. That's the reason I would prefer to keep it, with well-
defined and locale-independent semantics.
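
To illustrate what "well-defined and locale-independent" could mean in practice, here is a minimal sketch in plain standard C++. It is not the QByteArray implementation; asciiToUpper and demoAsciiToUpper are hypothetical names used only for this example. The assumption is that only the ASCII letters a-z are mapped and every other byte, including the bytes of multi-byte UTF-8 sequences, passes through untouched.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a Qt API): uppercases only the ASCII letters
// a-z, byte by byte, independent of the current locale.
std::string asciiToUpper(std::string bytes)
{
    for (char &c : bytes) {
        if (c >= 'a' && c <= 'z')
            c = static_cast<char>(c - 'a' + 'A');
    }
    return bytes;
}

// Usage sketch: "café" in UTF-8, where 'é' is the two bytes 0xC3 0xA9.
// The ASCII letters change; the bytes of 'é' are left as they were.
void demoAsciiToUpper()
{
    assert(asciiToUpper("caf\xC3\xA9") == "CAF\xC3\xA9");
}
```

The result is predictable on every system, but it is not a textual uppercase: the "é" stays lowercase, which is exactly the kind of surprise data from the field produces.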
> Working for a
> regulated industry, getting rid of potential bugs is my #1 concern, not
> that of having more fancy utf8 features! However, if deriving a QUtf8String
> from QByteArray is inappropriate (of which I am not totally convinced...
> cannot see a Liskov-Substitution-Principle violation in this case), I
> understand the task may be daunting. It may be argued too that COW is not
> interesting for such strings and APIs can be fixed by using u8string, but
> then, you ask Qt users to master both QString and std::string like APIs...
We don't ask users to use std::string APIs. That is not a text class,
std::string is analogous to QByteArray. C++ does not have a text container
class and that's not going to come until at least 2023 (C++2b).
std::string, like QByteArray, is encoding-agnostic but has some string-like
convenience functions over a pure byte storage (like std::vector<byte>), like
searching for a substring occurrence, instead of single value_type elements.
QByteArray does when we unified it with QCString in Qt 4.0.
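
A small sketch of what "encoding-agnostic" means for std::string, assuming the contents happen to be UTF-8 (demoByteSemantics is a hypothetical name for this example only): sizes and searches operate on bytes, with no notion of characters.

```cpp
#include <cassert>
#include <string>

// Hypothetical demo: std::string treats its contents as bytes, not text.
void demoByteSemantics()
{
    // "naïve" in UTF-8: 'ï' is the two bytes 0xC3 0xAF.
    const std::string text = "na\xC3\xAFve";

    // size() counts bytes (6), not the 5 user-visible characters.
    assert(text.size() == 6);

    // find() with a substring is a byte-wise search: "ve" starts at byte 4.
    assert(text.find("ve") == 4);

    // Searching for a lone continuation byte also succeeds, at byte 3,
    // even though 0xAF on its own is not a valid UTF-8 character.
    assert(text.find('\xAF') == 3);
}
```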
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel System Software Products