[Development] Are char literals L1 or U8 in Qt?

David C. Partridge david.partridge at perdrix.co.uk
Wed Jun 12 12:06:23 CEST 2024


That's not the current encoding scheme for *national* flags.

The flag for Wales is done using "regional flag encoding", based on modifiers to "Waving black flag" (U+1F3F4), which I agree can have up to seven code points. In any case, the critical point we agree on is the > 1 byte issue...

D.

-----Original Message-----
From: Giuseppe D'Angelo <giuseppe.dangelo at kdab.com> 
Sent: 12 June 2024 10:02
To: Edward Welbourne <edward.welbourne at qt.io>; David C. Partridge <david.partridge at perdrix.co.uk>; development at qt-project.org
Subject: Re: [Development] Are char literals L1 or U8 in Qt?

On 12/06/2024 10:51, Edward Welbourne wrote:
> I'll trust Peppe's count is thus of bytes in UTF-8.

No, it's 7 code *points*. Regional flags have a complicated encoding 
scheme. Wales' flag is encoded as:

U+1F3F4 WAVING BLACK FLAG
U+E0067 TAG LATIN SMALL LETTER G
U+E0062 TAG LATIN SMALL LETTER B
U+E0077 TAG LATIN SMALL LETTER W
U+E006C TAG LATIN SMALL LETTER L
U+E0073 TAG LATIN SMALL LETTER S
U+E007F CANCEL TAG

Each one requires 4 UTF-8 code units, that is, a total of 28 bytes.
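
As a rough check of that arithmetic, here is a minimal sketch in plain standard C++ (no Qt). The names walesFlag, appendUtf8 and toUtf8 are made up for this illustration and are not Qt API; the encoder is hand-rolled only to make the byte count visible.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// The seven code points of the Wales flag sequence listed above.
constexpr std::array<char32_t, 7> walesFlag = {
    U'\U0001F3F4', // WAVING BLACK FLAG
    U'\U000E0067', // TAG LATIN SMALL LETTER G
    U'\U000E0062', // TAG LATIN SMALL LETTER B
    U'\U000E0077', // TAG LATIN SMALL LETTER W
    U'\U000E006C', // TAG LATIN SMALL LETTER L
    U'\U000E0073', // TAG LATIN SMALL LETTER S
    U'\U000E007F', // CANCEL TAG
};

// Append the UTF-8 encoding of a single code point to `out`.
// Hypothetical helper for illustration; rejects surrogates and out-of-range values.
void appendUtf8(std::string &out, char32_t cp)
{
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x80) {                 // 1 code unit
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {         // 2 code units
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {       // 3 code units
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                         // 4 code units: everything above U+FFFF
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Encode a whole sequence of code points as UTF-8.
template <std::size_t N>
std::string toUtf8(const std::array<char32_t, N> &codePoints)
{
    std::string out;
    for (char32_t cp : codePoints)
        appendUtf8(out, cp);
    return out;
}
```

Calling toUtf8(walesFlag).size() should give 28, against walesFlag.size() == 7: one flag on screen, seven code points, twenty-eight char-sized code units.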

My point was that Unicode is incredibly complicated, and one should just 
use higher-level facilities that know how to do this.

My 2 c,
-- 
Giuseppe D'Angelo | giuseppe.dangelo at kdab.com | Senior Software Engineer
KDAB (France) S.A.S., a KDAB Group company
Tel. France +33 (0)4 90 84 08 53, http://www.kdab.com
KDAB - Trusted Software Excellence



