[Development] Oslo, we have a problem</apollo 13> [char8_t]
giuseppe.dangelo at kdab.com
Sun Jul 7 19:21:13 CEST 2019
On 06/07/2019 12:43, Mutz, Marc via Development wrote:
> C++20 is coming along, and it brings a disruptive change, one that far
> surpasses the C++17 noexcept break: u8"Hello" is now const char8_t, no
> longer const char.
> To estimate the amount of breakage this will cause, assuming that using
> u8"" is good practice today to indicate that a string is in UTF-8, I've
> tried to have at least QByteArray not break... and failed.
Whether that is good practice is actually questionable: SG16 reports
that u8 has seen very limited adoption (and I, for one, have not been
recommending its use until the C++2a situation is clarified):
> Code surveys have so far revealed little use of u8 literals
> The initial idea is simple enough: add const char8_t* overloads for
> const char* functions. This breaks passing nullptr, so you also add
> std::nullptr_t overloads. This, however, still doesn't fix the case
> where a 0 is passed. I had expected the std::nullptr_t overload to be a
> better match than the const char[8_t]* ones, but GCC 9.1 disagrees,
> and tells me it's still ambiguous.
> So, if GCC is right, we have no way of adapting our API to not break in
> C++20. So we need to decide what to break:
> a) using 0 for nullptr, or
> b) using u8"Hello" at all
> The forward-looking choice would be to break (a) and support (b).
In the general case: break (a), the use of 0 for nullptr. Such code would
fail anyhow as soon as one adds overloads taking other pointer types,
not specifically char8_t*; and adding overloads has to be acceptable in
the general case. Plus: we already have warnings for using 0 as a null
pointer constant, and clang-tidy can automate the migration. On the other
hand, I'm not sure about MSVC.
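To make the overload problem concrete, here is a minimal sketch. The
function name which() and the helper check_overloads() are hypothetical
and are not Qt API; the char8_t overload is guarded by the __cpp_char8_t
feature macro so the snippet also builds before C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical free functions standing in for an overloaded API;
// illustration only, not Qt's actual signatures.
inline std::string_view which(const char *) { return "const char*"; }
#if defined(__cpp_char8_t)  // only when char8_t is a distinct type (C++20)
inline std::string_view which(const char8_t *) { return "const char8_t*"; }
#endif
inline std::string_view which(std::nullptr_t) { return "std::nullptr_t"; }

inline void check_overloads()
{
    assert(which("Hello") == "const char*");
    // nullptr matches std::nullptr_t exactly, so this call is unambiguous.
    assert(which(nullptr) == "std::nullptr_t");
#if defined(__cpp_char8_t)
    assert(which(u8"Hello") == "const char8_t*");
#else
    assert(which(u8"Hello") == "const char*");  // pre-C++20: u8"" is char
#endif
    // which(0);  // does not compile: the literal 0 needs a conversion of
    //            // the same rank for every overload, so it is ambiguous
}
```

The commented-out which(0) call is the case that cannot be rescued by
adding overloads; callers would have to write nullptr instead.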
In the specific case: are we sure it makes sense to add a char8_t
constructor to QByteArray? Currently it sits somewhere between being a pure
"std::byte vector" (e.g. it's used to transmit raw bytes from I/O
devices, etc.) and a US-ASCII (?) string (e.g. given some of its APIs,
like toUpper()). By no means is it a container of UTF-8 encoded strings,
and we shouldn't give the illusion that it is.
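As a rough illustration of why byte-wise, ASCII-style operations do not
make a byte container UTF-8 aware, consider the sketch below. The helper
ascii_upper() is hypothetical and only assumes US-ASCII case mapping; it
is not QByteArray::toUpper() and makes no claim about how Qt implements it.

```cpp
#include <string>

// Hypothetical helper, not QByteArray::toUpper(): upper-cases only the
// bytes 'a'..'z' and leaves every other byte untouched.
inline std::string ascii_upper(std::string bytes)
{
    for (char &c : bytes) {
        if (c >= 'a' && c <= 'z')
            c = static_cast<char>(c - 'a' + 'A');
    }
    return bytes;
}
```

Given the UTF-8 bytes of "café" ("caf\xC3\xA9"), this returns
"CAF\xC3\xA9": the two bytes encoding 'é' pass through unchanged, so the
result is not the upper-cased string a UTF-8 aware API would produce.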
Of course there are plenty of other APIs that will need a resolution
instead... just to name one: QString::fromUtf8.
My 2 c,
Giuseppe D'Angelo | giuseppe.dangelo at kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - The Qt, C++ and OpenGL Experts