[Development] RFC: Defaulting to or enforcing UTF-8 locales on Unix systems
Thiago Macieira
thiago.macieira at intel.com
Mon Nov 4 23:27:23 CET 2019
On Monday, 4 November 2019 10:55:03 PST Thiago Macieira wrote:
> I'll do a full search on Clear Linux to see if there's any software that
> checks the return value of setlocale().
I searched for all "setlocale" calls.
First, the calls that are passed to strcmp: I found comparisons in gnulib and
replacements for setlocale, which don't count (they're replacements for old
systems Qt no longer [has never?] runs on). That left a couple of examples of
exactly what you predicted:
glfw-3.3/src/x11_init.c: if (strcmp(setlocale(LC_CTYPE, NULL), "C") == 0)
https://github.com/glfw/glfw/blob/master/src/x11_init.c#L934-L942
A hack around the C locale not supporting wide characters, which wouldn't be
needed if we set the environment.
firefox-60.1.0/xpcom/build/XPCOMInit.cpp: if (strcmp(setlocale(LC_ALL, nullptr), "C") == 0) {
https://searchfox.org/mozilla-central/source/xpcom/build/XPCOMInit.cpp#337
the next line does setlocale(LC_ALL, "")
wxWidgets-3.1.2/src/common/intl.cpp: wxASSERT_MSG(strcmp(setlocale(LC_ALL, NULL), "C") == 0,
https://github.com/wxWidgets/wxWidgets/blob/master/src/common/intl.cpp#L1694
Appears to be Windows-specific.
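To make the pattern in these examples concrete, here is a minimal sketch of the "is the locale still plain C?" check. The helper names are hypothetical and not taken from glfw, Firefox or wxWidgets. The point it illustrates is that an exact strcmp against "C" stops matching once the locale name is something like "C.UTF-8", so the fallback branch is skipped.

```c
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: exact-match test on a locale name, the way the
 * examples above compare the result of setlocale() with "C". */
static bool is_plain_c_name(const char *name)
{
    return name != NULL && strcmp(name, "C") == 0;
}

/* Hypothetical helper: query (do not change) the current locale for a
 * category and test whether it is still the startup "C" locale. */
static bool locale_is_plain_c(int category)
{
    return is_plain_c_name(setlocale(category, NULL));
}

/* Hypothetical helper mirroring the pattern: adopt the environment's
 * locale only if nothing has moved us away from "C" yet.
 * Returns true if the locale was changed. */
static bool adopt_environment_locale_if_c(int category)
{
    if (!locale_is_plain_c(category))
        return false;
    return setlocale(category, "") != NULL;
}
```

With this sketch, is_plain_c_name("C") is true while is_plain_c_name("C.UTF-8") is false, which is why such checks deserve a look if the default locale name changes.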
The assignments are much more numerous (1700 of them in my listing). A lot of
them are of the form:
old_locale = setlocale(LC_xxx, NULL);
which I assume is later followed up by a setlocale(LC_xxx, old_locale). These
cases are not relevant to us.
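For reference, a minimal sketch of that save/restore idiom, assuming the usual reason for it (forcing "C" number formatting for one call). The function name is hypothetical. The current name is copied before switching because the string returned by setlocale() may be overwritten by a later setlocale() call.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format a double with the "C" LC_NUMERIC rules,
 * then restore whatever locale was active before.
 * Returns 0 on success, -1 on failure.
 * Not thread-safe: setlocale() changes process-wide state. */
static int format_double_in_c_locale(char *buf, size_t size, double value)
{
    /* Query only: a NULL locale argument does not change anything. */
    const char *current = setlocale(LC_NUMERIC, NULL);
    if (current == NULL)
        return -1;

    /* Keep our own copy of the old locale name. */
    size_t len = strlen(current) + 1;
    char *saved = malloc(len);
    if (saved == NULL)
        return -1;
    memcpy(saved, current, len);

    if (setlocale(LC_NUMERIC, "C") == NULL) {
        free(saved);
        return -1;
    }

    int written = snprintf(buf, size, "%g", value);

    /* Restore the previous locale and release the copy. */
    setlocale(LC_NUMERIC, saved);
    free(saved);

    return (written >= 0 && (size_t)written < size) ? 0 : -1;
}
```

Code like this only round-trips the locale name it was given, so it behaves the same whether that name is "C" or a UTF-8 variant.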
https://github.com/GNUAspell/aspell/blob/master/common/config.cpp#L549-L561
Needs the locale to know which language to spell-check in and also how to
decode the text. UTF-8 is supported.
http://git.savannah.gnu.org/cgit/bash.git/tree/locale.c
Aside from the check *for* UTF-8 in LC_CTYPE, the assignments are only
checking for null pointers.
http://git.savannah.gnu.org/cgit/bison.git/tree/src/getargs.c#n446
http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/system.h
Not relevant for us.
https://github.com/BOINC/boinc/blob/master/zip/zip/zip.c#L2214
Null check only, and checks for UTF-8
https://github.com/BOINC/boinc/blob/master/zip/unzip/unzip.c#L773
Not relevant, in #else for nl_langinfo
https://github.com/microsoft/cpprestsdk/blob/master/Release/src/utilities/asyncrt_utils.cpp
Win32 only
https://github.com/apple/cups/blob/master/cups/language.c
Handles UTF-8 just fine.
https://github.com/apple/cups/blob/master/cups/langprintf.c
Forces .UTF-8.
https://github.com/doxygen/doxygen/blob/master/qtools/qtextcodec.cpp#L508-L529
Trying to guess what QTextCodec to use for ru_RU.
https://git.enlightenment.org/core/efl.git/tree/src/modules/ecore_imf/xim/ecore_imf_xim.c#n832
Null check only. The rest of EFL is save/restore.
http://git.savannah.gnu.org/cgit/emacs.git/tree/src/sysdep.c#n4049
Null check only.
http://git.savannah.gnu.org/cgit/emacs.git/tree/src/sysdep.c#n4049
COULD be a problem, as it does strcmp(locale, "C") and then sets locale = "en"
https://github.com/GNOME/evince/blob/mainline/cut-n-paste/synctex/synctex_parser.c#L4384-L4399
Save/restore.
https://github.com/GNOME/evolution-data-server/blob/mainline/src/camel/camel-iconv.c#L218
Does compare to "C", but not a problem since the failing case uses nl_langinfo
https://github.com/GNOME/evolution-data-server/blob/mainline/src/addressbook/libedata-book/e-book-sqlite.c#L2891
Doesn't seem to be a problem.
https://github.com/GNOME/evolution/blob/mainline/src/e-util/e-xml-utils.c#L66
Just getting defaults.
https://github.com/fish-shell/fish-shell/blob/3.0.2/src/env.cpp#L373-L396
Comparing old to new. And no longer present in master.
https://github.com/fltk/fltk/blob/master/src/Fl_Native_File_Chooser_GTK.cxx#L445-L458
Save/restore, not thread-safe.
https://github.com/zenotech/fox-toolkit/blob/master/src/FXTranslator.cpp#L84
Commented out.
http://git.savannah.gnu.org/cgit/gawk.git/tree/support/dfa.c#n988
Not a problem, just checking if the locale is ASCII-compatible.
binutils-gdb/blob/master/readline/readline/nls.c
Seems fine too.
https://github.com/geany/geany/blob/master/src/libmain.c#L980-L987
Only used in debug output
https://github.com/fangq/gftp/blob/master/lib/protocols.c#L382-L395
Null-pointer check & logging
https://github.com/GNOME/glib/blob/mainline/glib/guniprop.c#L724
Safe
https://github.com/GNOME/glib/blob/mainline/glib/gtranslit.c#L293
Seems to be fine
https://github.com/GNOME/glib/blob/mainline/glib/gdate.c#L1057-L1065
Checking cached results
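Several entries above come down to "null check, then look for UTF-8". A minimal sketch of that shape follows, assuming a POSIX system with nl_langinfo(). The helper names are hypothetical and the code is not copied from any of the listed projects.

```c
#include <ctype.h>
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>

/* Hypothetical helper: true if the codeset name spells UTF-8, ignoring
 * case and hyphens, so both "UTF-8" and "utf8" are accepted. */
static bool codeset_is_utf8(const char *codeset)
{
    if (codeset == NULL)
        return false;

    const char *expected = "utf8";
    for (; *codeset != '\0'; ++codeset) {
        if (*codeset == '-')
            continue;                      /* skip hyphens */
        if (*expected == '\0' ||
            tolower((unsigned char)*codeset) != *expected)
            return false;                  /* extra or different character */
        ++expected;
    }
    return *expected == '\0';              /* all of "utf8" was seen */
}

/* Hypothetical helper: apply the environment's LC_CTYPE, null-check the
 * result, then ask which codeset that locale uses. */
static bool init_locale_and_check_utf8(void)
{
    if (setlocale(LC_CTYPE, "") == NULL)
        return false;   /* the environment names a locale we cannot load */
    return codeset_is_utf8(nl_langinfo(CODESET));
}
```

Code of this shape never compares the locale name against "C", so it should not care whether the default becomes a UTF-8 locale.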
I'm stopping here.
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel System Software Products
-------------- next part --------------
A non-text attachment was scrubbed...
Name: setlocale-grep.zst
Type: application/zstd
Size: 101220 bytes
Desc: not available
URL: <http://lists.qt-project.org/pipermail/development/attachments/20191104/7daaad62/attachment-0001.bin>