[Development] Are char literals L1 or U8 in Qt?
David C. Partridge
david.partridge at perdrix.co.uk
Wed Jun 12 10:30:30 CEST 2024
Nope, just TWO code points, e.g. 🇺 (U+1F1FA: REGIONAL INDICATOR SYMBOL LETTER U) followed by 🇸 (U+1F1F8: REGIONAL INDICATOR SYMBOL LETTER S) for the US flag.
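
For illustration only, here is a minimal sketch in plain C++17 (no Qt) that counts the code points in the US flag sequence above. The names countCodePoints and usFlag are made up for this example, and the function assumes well-formed UTF-8 input; it does no validation and says nothing about grapheme boundaries.

```cpp
#include <cstddef>
#include <string_view>

// Count Unicode code points in a UTF-8 byte sequence by counting every byte
// that is not a continuation byte (continuation bytes look like 10xxxxxx).
// Assumption: the input is well-formed UTF-8; nothing is validated here.
constexpr std::size_t countCodePoints(std::string_view utf8)
{
    std::size_t n = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

// U+1F1FA followed by U+1F1F8, spelled out as raw UTF-8 bytes so the
// example does not depend on the source file's encoding.
constexpr std::string_view usFlag = "\xF0\x9F\x87\xBA\xF0\x9F\x87\xB8";
```

With these definitions, usFlag.size() gives the byte count and countCodePoints(usFlag) gives the code point count, so the two can be compared directly.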
-----Original Message-----
From: Development <development-bounces at qt-project.org> On Behalf Of Giuseppe D'Angelo via Development
Sent: 11 June 2024 20:09
To: development at qt-project.org
Subject: Re: [Development] Are char literals L1 or U8 in Qt?
Il 11/06/24 11:36, David C. Partridge ha scritto:
> Anyone iterating bytewise over a char[] in UTF-8 has also got serious
> bugs given that a UTF-8 "graphic character" can be up to 8 bytes
> (national flags comprise two UTF-8 code points).
There's no such thing as a UTF-8 "graphic character". Grapheme sequences are treated at a higher level anyhow in Qt, and we have APIs for that (QTextBoundaryFinder, etc.).
And it's not 2. 🏴 is 7 code points.
My 2 c,
--
Giuseppe D'Angelo | giuseppe.dangelo at kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - Trusted Software Excellence