[Interest] Using UTF-8 code page with Qt5 on Windows?
alvinhochun at gmail.com
Wed May 18 14:29:41 CEST 2022
I am considering enabling UTF-8 as the activeCodePage ^ on Windows
(supported on Windows Version 1903 and later) for Krita, to improve
our handling of Unicode file paths when interacting with external
C/C++ libraries. As I have not found any existing discussion of this
topic, I am now investigating how Qt (5.12 in our case) would be
affected under this configuration.
I suspect that, since QString uses UTF-16 and Qt should already be
using the -W versions of the Windows API, this should for the most
part not affect how Qt operates. As far as I know, the only component
that would be affected is the system QTextCodec (qwindowscodec.cpp),
which is also used by QString::fromLocal8Bit and QString::toLocal8Bit.
It calls WideCharToMultiByte and MultiByteToWideChar with CP_ACP, so
when activeCodePage is set to UTF-8, CP_ACP refers to UTF-8 instead of
the system ACP (e.g. Windows-1252, Big5, Shift JIS, ...).
In theory it should just work, but when reviewing qwindowscodec.cpp I
noticed code that seems to assume the MBCS uses at most two bytes per
character, which is not true for UTF-8 (in which a Unicode code point
can be encoded as up to 4 UTF-8 code units). The same code exists in
Qt 6, just moved to a different location. As I am not familiar with
how QTextCodec works, I cannot quite tell whether this is a real issue
or not. Can anyone here give some advice?
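To make the concern concrete, here is a minimal standalone sketch. It
is not the code from qwindowscodec.cpp, and the names encodeUtf8 and
incompleteTailLength are hypothetical, made up for this example. It
assumes well-formed UTF-8 input. It shows that a single code point can
take up to 4 bytes, and that when input is split at an arbitrary
position, a stateful decoder may have to carry over up to 3 bytes,
not just the 1 lead byte a double-byte code page would need.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8
// (1 to 4 code units).
std::string encodeUtf8(char32_t cp)
{
    // Surrogates and values above U+10FFFF are not valid scalar values.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: number of bytes at the end of buf that begin a
// UTF-8 sequence which is not yet complete (0 to 3). Assumes the data
// before the tail is well-formed UTF-8.
std::size_t incompleteTailLength(const std::string &buf)
{
    const std::size_t n = buf.size();
    // A lead byte can be at most 3 bytes before the end of an
    // incomplete sequence, so look back no further than that.
    for (std::size_t back = 1; back <= 3 && back <= n; ++back) {
        const unsigned char b = static_cast<unsigned char>(buf[n - back]);
        if ((b & 0xC0) == 0x80)
            continue; // continuation byte, keep looking for the lead byte
        std::size_t need = 1; // ASCII, or a byte that cannot start a sequence
        if ((b & 0xE0) == 0xC0)
            need = 2;
        else if ((b & 0xF0) == 0xE0)
            need = 3;
        else if ((b & 0xF8) == 0xF0)
            need = 4;
        return need > back ? back : 0;
    }
    return 0;
}
```

For example, cutting the 4-byte encoding of U+1F600 after its third
byte leaves 3 pending bytes, so any conversion state that can only
hold one leftover byte would not be enough for this case.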
I would also like to ask if Qt will officially support using UTF-8 as
the ACP on Windows.
^ Note: One way of setting activeCodePage to UTF-8 is by using the
application manifest, which applies the option on a per-process
basis. Another way is to enable it system-wide by enabling the
option "Beta: Use Unicode UTF-8 for worldwide language support" in
Region Settings.