[Development] HEADS-UP: QStringLiteral

Thiago Macieira thiago.macieira at intel.com
Wed Aug 28 07:21:06 CEST 2019

On Tuesday, 27 August 2019 16:57:55 PDT Kevin Kofler wrote:
> If you do not explicitly add ".UTF-8", glibc always gives you the obsolete
> legacy locale with the locale-specific pre-Unicode character set. This is
> intentional for backwards compatibility. So you should never use a locale
> without a ".UTF-8" suffix, unless, like Thiago, you want to deliberately
> test what happens in a legacy non-UTF-8 locale.
> The locales are interpreted by glibc. Anything that assumes that a given
> locale uses a character set different from what glibc actually uses for that
> locale is broken. (But it looks like GCC doesn't assume anything about the
> locale and just always uses UTF-8 to begin with, contrary to what the
> documentation claims.)

Indeed. The charset can be obtained with the nl_langinfo(3) function from the 
C library. Since there's no tool to print it for us, we use Python:

$ cat langinfo.py
import locale
$ python3 langinfo.py
$ LC_ALL=C python3 langinfo.py
$ LC_ALL=pt_BR python3 langinfo.py
$ LC_ALL=fr_FR@euro python3 langinfo.py
$ LC_ALL=el_GR python3 langinfo.py
$ LC_ALL=zh_CN python3 langinfo.py
$ LC_ALL=ja_JP python3 langinfo.py
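
The body of langinfo.py and the output of each run did not survive in the
archived copy of this message. As a rough sketch only, and not necessarily the
script used above, a query of nl_langinfo(3) from Python could look like the
following. The function name codeset_for and the command-line handling are
assumptions made for illustration; locale.setlocale, locale.nl_langinfo and
locale.CODESET are the standard library calls (Unix only).

```python
import locale
import sys


def codeset_for(name=""):
    """Return the character set the C library reports for a locale.

    An empty name means "take the locale from the environment"
    (LC_ALL, LC_CTYPE, LANG). Returns None if the locale cannot be set,
    for example because it is not installed on this system.
    """
    try:
        locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        return None
    # CODESET is the nl_langinfo(3) item that names the locale's charset.
    return locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    # With no arguments, report the charset of the environment's locale,
    # which is what "LC_ALL=pt_BR python3 langinfo.py" would exercise.
    for name in sys.argv[1:] or [""]:
        print(name or "(environment)", codeset_for(name))
```

Saved as langinfo.py, this can be run as in the transcript above, or with
locale names as arguments (python3 langinfo.py pt_BR el_GR). What it prints
depends on which locales are installed.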

I'm *so* glad I didn't remember three of the above and hadn't had to think of 
them for 15 years. (I thought Japanese on Unix used Shift-JIS and Russian used 

Anyway, doing a memory wipe. Aside from ISO-8859-1, I don't want to think of 
any of the others for another 15 years.
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products
