[Development] RFC: Defaulting to or enforcing UTF-8 locales on Unix systems

Thiago Macieira thiago.macieira at intel.com
Wed Mar 22 22:50:12 CET 2023


On Wednesday, 22 March 2023 09:48:05 HST Volker Hilsheimer via Development 
wrote:
> Even if one Qt 5 application and one Qt 6 application exchange data over a
> local socket, unwisely using to/fromLocal8Bit for the purpose - if the Qt 5
> application continues to run with the system code page, then the Qt 6
> application starting to send UTF-8 encoded data will break this.

QLocalSocket is very rare on Windows. And any decent socket code that is 
prepared to work over networks has either used proper 8-bit tagging to 
indicate the encoding (since 2001) or plain UTF-8 (since 2003).

The console is already a mess on Windows because it involves not just the ACP
used by the Win32 "A" API, but also the legacy DOS (OEM) code page (the mess
that renders my middle name JosÚ or JosΘ). Since that is already a mess, I
don't particularly
find it problematic to see José now... wouldn't be the first time. Most Windows 
applications aren't console applications so this is a limited issue. It's also 
time-limited: those issues should smooth out easily with proper terminal 
applications, which is how we solved it in the Unix world too.
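
To make the console mix-up concrete, here is a small sketch in Python (not Qt code) of how the same name comes out under different code pages. The choice of cp1252 as the ACP and of cp850 and cp437 as the DOS code pages is an assumption for illustration; real systems vary.

```python
# Illustration only: the code pages below are assumed examples,
# not values queried from any real system.
name = "José"

# Bytes as a Win32 "A" API would produce them under an assumed cp1252 ACP.
acp_bytes = name.encode("cp1252")

# The same bytes interpreted by a console using a legacy DOS code page.
as_cp850 = acp_bytes.decode("cp850")
as_cp437 = acp_bytes.decode("cp437")

# UTF-8 bytes interpreted by a reader that still assumes cp1252.
utf8_as_cp1252 = name.encode("utf-8").decode("cp1252")

print(as_cp850, as_cp437, utf8_as_cp1252)
```

The single byte 0xE9 is "é" in cp1252 but a different character in each DOS code page, which is where the two garbled spellings come from.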

No, the far more likely scenario is interchange via files and via pipes to 
child processes. So yes, finding out what the legacy ACP is might be a useful 
piece of information. It shouldn't be the toLocal8Bit encoding, but it should 
be available should the need arise.
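
As a rough sketch of why having the legacy code page available can matter for files and pipes, here is a hypothetical Python helper (not a Qt API). The name decode_interchange, the UTF-8-first heuristic, and the cp1252 default are all my assumptions for illustration; the caller would pass in whatever the legacy ACP really is.

```python
def decode_interchange(data: bytes, legacy_codec: str = "cp1252") -> str:
    """Decode bytes read from a file or a child-process pipe.

    Hypothetical helper: tries UTF-8 first and only falls back to the
    caller-supplied legacy code page if the bytes are not valid UTF-8.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback: the producer still writes in the legacy ACP.
        return data.decode(legacy_codec, errors="replace")


# A UTF-8 producer and a legacy cp1252 producer both round-trip.
from_new_app = decode_interchange("José".encode("utf-8"))
from_old_app = decode_interchange("José".encode("cp1252"))
```

The heuristic is imperfect, since short legacy byte sequences can happen to be valid UTF-8, so an explicit per-channel encoding choice is safer where the producer is known.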

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Cloud Software Architect - Intel DCAI Cloud Engineering

