[Development] Are char literals L1 or U8 in Qt?

Marc Mutz marc.mutz at qt.io
Tue Jun 11 06:51:21 CEST 2024


On 10.06.24 23:13, Thiago Macieira wrote:
> On Monday 10 June 2024 05:39:26 GMT-7 Marc Mutz via Development wrote:
>> Since there are four bugs³ in QString::arg() that are all fixed by the
>> existing patch chain porting the whole thing to QAnyStringView, and
>> since the medium-term goal is to deprecate use of char for characters
>> and char[] for strings (QT_ASCII_WARN), anyway, I would like to fix
>> QASV(char) to mean QASV(QChar(char)), not redefine char literals as
>> UTF-8 and break many more users (QASV is relatively new; QChar(char) and
>> QString::arg(char) are there since before Qt 4).
> 
> I am all for fixing the incompatibility, but I am of the opinion that char-as-
> Latin1 was the wrong choice. It was wrong in Qt 4 and is still wrong now.
> My point is that a const char[] is a UTF-8 string, therefore each char in
> there is a UTF-8 code unit.
> 
> Anyone iterating character by character across two different encodings probably
> already has bugs.

While all of the above is true, I'm missing a way forward out of this
situation. AFAICT, we have three options:

- keep status quo (then all QAnyStringView-taking functions need to get
   a QChar overload, and will probably be ambiguous, so even with "status
   quo", we'd need to change _something_ O(#functions-using-QASV))
- change `char` to be consistently L1, phase out `char` as a character
   and string type (my proposal; AFAICT, limited impact: you can already
   compile with QT_NO_CAST_FROM_ASCII, even in Qt 5, to get very close to
   the future state, and the changes are limited to QASV only, i.e. O(1))
- change `char` to be consistently U8 (in case it wasn't obvious, this
   _silently_ breaks (probably large amounts of) existing code - _now_,
   independent of the second step here (keep behaviour or also phase out
   char))
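
To make the difference between the second and third option concrete,
here is a minimal sketch in plain C++ (no Qt involved). The helpers
decodeAsLatin1() and decodeAsLoneUtf8Unit() are hypothetical names made
up for this illustration, not Qt API, and mapping an undecodable byte to
U+FFFD is an assumption of the sketch, not a statement about what QASV
actually does:

```cpp
// Hypothetical helpers for illustration only; these are not Qt APIs.

// Latin-1 reading: every byte value 0x00..0xFF is the code point of the
// same numeric value.
constexpr char32_t decodeAsLatin1(char c)
{
    return static_cast<char32_t>(static_cast<unsigned char>(c));
}

// UTF-8 reading of a lone code unit: only 0x00..0x7F is a complete
// character. Any byte >= 0x80 is a lead byte, a continuation byte, or
// invalid, so it cannot be decoded on its own. This sketch (assumption)
// reports such bytes as U+FFFD REPLACEMENT CHARACTER.
constexpr char32_t decodeAsLoneUtf8Unit(char c)
{
    return static_cast<unsigned char>(c) < 0x80
        ? static_cast<char32_t>(static_cast<unsigned char>(c))
        : char32_t(0xFFFD);
}

// ASCII is identical under both readings ...
static_assert(decodeAsLatin1('a') == U'a', "ASCII as Latin-1");
static_assert(decodeAsLoneUtf8Unit('a') == U'a', "ASCII as UTF-8");

// ... but byte 0xE4 is U+00E4 in Latin-1 and only a fragment in UTF-8.
static_assert(decodeAsLatin1('\xE4') == char32_t(0xE4), "0xE4 as Latin-1");
static_assert(decodeAsLoneUtf8Unit('\xE4') == char32_t(0xFFFD),
              "0xE4 alone is not a complete UTF-8 sequence");
```

The two readings only disagree for bytes outside ASCII, which is why
switching from one to the other breaks existing code silently instead
of at compile time.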

Or, a fourth option: we can merge a patch (still for 6.8) that warns (or
even fails to compile) if a user uses QASV(char), even outside 
QT_NO_CAST_FROM_ASCII, telling them to use char16_t or QLatin1Char 
instead, and then fix the behaviour in Qt 6.9, or even keep it as a 
first step toward Qt 7 behaviour (though I'd like to keep a port from a 
string-ish overload set to QASV source-compatible; it was a lot of work 
to get there).
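
As a rough sketch of what the "fails to compile" variant of that fourth
option could look like, assuming a deleted char constructor is used:
Latin1CharSketch and AnyStringViewSketch below are hypothetical
stand-ins, not the real QLatin1Char or QAnyStringView, and the actual
patch may well look different.

```cpp
#include <type_traits>

// Hypothetical stand-in for QLatin1Char: the caller states explicitly
// that the char is to be read as Latin-1.
struct Latin1CharSketch {
    constexpr explicit Latin1CharSketch(char c)
        : value(static_cast<unsigned char>(c)) {}
    unsigned char value;
};

// Hypothetical stand-in for a QASV-like view over a single character.
class AnyStringViewSketch {
public:
    constexpr AnyStringViewSketch(char16_t c) : m_codeUnit(c) {}
    constexpr AnyStringViewSketch(Latin1CharSketch c) : m_codeUnit(c.value) {}

    // Plain char is rejected, so the caller has to pick an encoding.
    // The "warns" variant would mark this constructor [[deprecated]]
    // instead of deleting it.
    AnyStringViewSketch(char) = delete;

    constexpr char16_t codeUnit() const { return m_codeUnit; }

private:
    char16_t m_codeUnit;
};

// A bare char no longer converts; the explicit spellings still do.
static_assert(!std::is_constructible<AnyStringViewSketch, char>::value,
              "char must not convert silently");
static_assert(std::is_constructible<AnyStringViewSketch, char16_t>::value,
              "char16_t is accepted");
static_assert(std::is_constructible<AnyStringViewSketch, Latin1CharSketch>::value,
              "explicit Latin-1 is accepted");
static_assert(AnyStringViewSketch(Latin1CharSketch('\xE4')).codeUnit() == 0xE4,
              "Latin-1 0xE4 maps to U+00E4");
```

The deleted overload wins overload resolution for a char argument (it is
an exact match), so the call is rejected instead of quietly converting
char to char16_t.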

Thanks,
Marc

-- 
Marc Mutz <marc.mutz at qt.io> (he/his)
Principal Software Engineer

The Qt Company
Erich-Thilo-Str. 10 12489
Berlin, Germany
www.qt.io

Geschäftsführer: Mika Pälsi, Juha Varelius, Jouni Lintunen
Sitz der Gesellschaft: Berlin,
Registergericht: Amtsgericht Charlottenburg,
HRB 144331 B


