[Development] Are char literals L1 or U8 in Qt?

Thiago Macieira thiago.macieira at intel.com
Wed Jun 12 01:56:01 CEST 2024


On Tuesday 11 June 2024 12:08:45 GMT-7 Giuseppe D'Angelo via Development 
wrote:
> On 11/06/24 07:12, Thiago Macieira wrote:
> > I'm arguing that such code is likely already broken (producing mojibake)
> > for non-US-ASCII content, so having U+FFFD instead of mojibake is not
> > worse. You wouldn't be able to work around the issue by un-doing the
> > improper encoding, which means it would force users to fix their code.
> 
> Is it? I somehow suspect that there's a lot of code out there that does 
> stuff like:
> 
>    string.indexOf('\xfc')   // search for ü

Indeed, but that's what I am arguing is somewhat broken. It works, but its 
usefulness is very limited: it only works for a small subset of the character 
set, and the character definitely can't come from user input.

However, it works.
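
To make the two interpretations concrete, here is a minimal sketch in plain
C++. It is not Qt code: fromLatin1() and fromAsciiOnly() are hypothetical
helpers invented for this illustration. The first models the Latin-1
reading that makes the '\xfc' search above find ü; the second models an
ASCII-only reading in which the same byte would become U+FFFD.

```cpp
// Latin-1 reading: every byte value 0x00..0xFF maps to the Unicode code
// point with the same number, so 0xFC becomes U+00FC (ü).
constexpr char32_t fromLatin1(char c)
{
    return static_cast<char32_t>(static_cast<unsigned char>(c));
}

// ASCII-only reading (an assumption for illustration): bytes below 0x80
// pass through unchanged, anything else is replaced by U+FFFD.
constexpr char32_t fromAsciiOnly(char c)
{
    return static_cast<unsigned char>(c) < 0x80
            ? static_cast<char32_t>(static_cast<unsigned char>(c))
            : static_cast<char32_t>(0xFFFD);
}

// Compile-time checks of the behaviour described above.
static_assert(fromLatin1('\xfc') == 0x00FC, "Latin-1: 0xFC is U+00FC");
static_assert(fromAsciiOnly('\xfc') == 0xFFFD, "ASCII-only: 0xFC is replaced");
static_assert(fromAsciiOnly('a') == fromLatin1('a'), "ASCII is unaffected");
```

Under the second reading, the indexOf('\xfc') call would search for U+FFFD
rather than ü, so it would stop matching instead of matching by accident.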

> Yet, breaking a ~20-year-old behavior in "low-level code" is ... scary? It 
> should require extraordinary motivation and care; we're probably talking 
> about making 6.8->6.14 warn if someone passes a non-ASCII char to 
> QASV/QChar(char)'s constructor, and change behavior to accept ASCII-only 
> in 6.15?

That's a fair argument. On the other hand, my argument is about not 
propagating old, erroneous behaviour to new API and, frankly, since we're 
somewhat inconsistent already, a little more inconsistency wouldn't hurt.

It starts to hurt when we begin replacing old API with new.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Principal Engineer - Intel DCAI Fleet Systems Engineering