[Development] HEADS-UP: QStringLiteral

Thiago Macieira thiago.macieira at intel.com
Mon Aug 26 19:33:20 CEST 2019


On Monday, 26 August 2019 09:20:49 PDT Lars Knoll wrote:
> > GCC and Clang default to UTF-8 *unless* you set -finput-charset to
> > something different, independent of what your locale is.
> 
> That wasn’t how I understood it. Here’s the corresponding man page entry
> from gcc:
> 
> -finput-charset=charset
>         Set the input character set, used for translation from the
>         character set of the input file to the source character set
>         used by GCC.  If the locale does not specify, or GCC cannot
>         get this information from the locale, the default is UTF-8.
>         This can be overridden by either the locale or this
>         command-line option.  Currently the command-line option takes
>         precedence if there's a conflict.  charset can be any encoding
>         supported by the system's "iconv" library routine.
> 
> I’m happy to be proven wrong, but to me this sounds like it’s getting the
> file encoding from the locale, if that one specifies a charset.

I think the documentation is wrong. Both GCC and Clang emit the same UTF-8
bytes for the literal no matter which locale is set:

$ gcc -S -o - -xc++ - <<<'auto s = u8"€áęǽ";' | grep -F .string
        .string "\342\202\254\303\241\304\231\307\275"
$ LC_ALL=C gcc -S -o - -xc++ - <<<'auto s = u8"€áęǽ";' | grep -F .string
        .string "\342\202\254\303\241\304\231\307\275"
$ LC_ALL=POSIX gcc -S -o - -xc++ - <<<'auto s = u8"€áęǽ";' | grep -F .string
        .string "\342\202\254\303\241\304\231\307\275"
$ LC_ALL=en_US gcc -S -o - -xc++ - <<<'auto s = u8"€áęǽ";' | grep -F .string
        .string "\342\202\254\303\241\304\231\307\275"
$ LC_ALL=pt_BR gcc -S -o - -xc++ - <<<'auto s = u8"€áęǽ";' | grep -F .string
        .string "\342\202\254\303\241\304\231\307\275"
$ LC_ALL=el_GR gcc -S -o - -xc++ - <<<'auto s = u8"€áęǽ";' | grep -F .string
        .string "\342\202\254\303\241\304\231\307\275"
$ LC_ALL=el_GR clang -S -o - -xc++ - <<<'auto s = u8"€áęǽ";' | grep -F .asciz
        .asciz  "\342\202\254\303\241\304\231\307\275"
$ LC_ALL=pt_BR clang -S -o - -xc++ - <<<'auto s = u8"€áęǽ";' | grep -F .asciz
        .asciz  "\342\202\254\303\241\304\231\307\275"
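
The same check can be expressed inside C++ itself. This is only a minimal
sketch, assuming the source file is saved as UTF-8 and compiled without
-finput-charset; the names `literal`, `expected` and `literalIsUtf8` are
made up for illustration and are not part of Qt or of either compiler.

```cpp
#include <cstring>

// Same text as in the shell commands above. Binding by reference keeps the
// array type, so this works whether u8"" yields char (C++11 to C++17) or
// char8_t (C++20).
static const auto &literal = u8"€áęǽ";

// The UTF-8 bytes the compilers printed in their .string/.asciz output.
static const unsigned char expected[] = {
    0342, 0202, 0254,   // U+20AC EURO SIGN
    0303, 0241,         // U+00E1 a with acute
    0304, 0231,         // U+0119 e with ogonek
    0307, 0275          // U+01FD ae with acute
};

// Returns true if the compiler decoded the source file as UTF-8, that is,
// if the literal holds exactly the expected bytes (ignoring the final NUL).
bool literalIsUtf8()
{
    return sizeof(literal) - 1 == sizeof(expected)
        && std::memcmp(literal, expected, sizeof(expected)) == 0;
}
```

Calling literalIsUtf8() from a main() built under different LC_ALL values
should give the same answer each time if the compiler ignores the locale,
which is what the runs above suggest.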

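And those locales really are non-UTF-8, as the mis-encoded error messages
show on a UTF-8 terminal:

$ LC_ALL=pt_BR ls doesntexist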
ls: cannot access 'doesntexist': Arquivo ou diret�rio inexistente
$ LC_ALL=el_GR ls doesntexist
ls: cannot access 'doesntexist': ��� ������� ������ ������ � ���������
$ LC_ALL=el_GR.UTF-8 ls doesntexist
ls: cannot access 'doesntexist': Δεν υπάρχει τέτοιο αρχείο ή κατάλογος

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel System Software Products





