[Development] HEADS-UP: QStringLiteral

Kevin Kofler kevin.kofler at chello.at
Wed Aug 28 01:57:55 CEST 2019


Edward Welbourne wrote:
> clang, gcc read input the same with LC_ALL unset and set variously to C,
> POSIX, en_US, pt_BR, el_GR.  I note that none of these explicitly
> selects an encoding, so the doc above is indeed consistent with gcc
> guessing UTF-8 based on the value of LC_ALL.  Even if the only el_GR or
> pt_BR locale your host actually has the necessary data compiled for are
> the ones using an encoding incompatible with UTF-8, gcc need not have
> actually checked that if it - like QSystemLocaleData on Unix - only
> looks at the value of environment variables.

If you do not explicitly append ".UTF-8" to the locale name, glibc always
gives you the legacy locale with its locale-specific pre-Unicode character
set. This is intentional, for backwards compatibility. So you should never
use a locale without a ".UTF-8" suffix, unless, like Thiago, you
deliberately want to test what happens in a legacy non-UTF-8 locale.
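
To see which character set the C library really associates with a given
locale name, you can ask it directly instead of guessing from the name. The
following is only a minimal sketch, assuming a POSIX system that provides
nl_langinfo(CODESET); the helper name codeset_of_locale is hypothetical and
not part of glibc or Qt.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: copies the character set that the C library uses
 * for `locale_name` into `buf`. Returns 0 on success, or -1 if the locale
 * is not available or a buffer is too small. The previous LC_CTYPE
 * setting is restored before returning. */
int codeset_of_locale(const char *locale_name, char *buf, size_t buflen)
{
    assert(locale_name != NULL && buf != NULL);

    /* setlocale() may reuse its return buffer, so keep our own copy. */
    char saved[256];
    const char *current = setlocale(LC_CTYPE, NULL);
    if (current == NULL || strlen(current) >= sizeof saved)
        return -1;
    strcpy(saved, current);

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1; /* the C library rejected this locale name */

    /* Ask the C library itself which encoding this locale uses. */
    const char *codeset = nl_langinfo(CODESET);
    int rc = -1;
    if (strlen(codeset) < buflen) {
        strcpy(buf, codeset);
        rc = 0;
    }

    setlocale(LC_CTYPE, saved);
    return rc;
}
```

Calling this with "el_GR" and then with "el_GR.UTF-8" on a glibc system
that has both locales installed should typically report a legacy codeset
such as ISO-8859-7 for the first and UTF-8 for the second. The exact
strings depend on the installed locale data.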

The locales are interpreted by glibc. Anything that assumes that a given 
locale uses a character set different from what glibc actually uses for that 
locale is broken. (But it looks like GCC doesn't assume anything about the 
locale and just always uses UTF-8 to begin with, contrary to what the 
documentation claims.)

        Kevin Kofler



