[Development] Why we have to remove codecFor... ?

Thu Jun 7 20:09:52 CEST 2012

Hi Thiago,

The real problem is MSVC2005, which cannot generate a UTF-8 execution
character set.

Starting with MSVC2010-SP1, Microsoft provides a workaround for this:
1) Save the source file with a BOM
2) Add "#pragma execution_character_set("utf-8")" to it
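
Checking condition 1) is easy to automate. A minimal sketch, assuming
the file content has already been read into a std::string; hasUtf8Bom
is a hypothetical helper name, not something from Qt or MSVC. The
UTF-8 BOM is the three bytes EF BB BF at the very start of the file.

```cpp
#include <string>

// Hypothetical helper: returns true if the buffer starts with the
// UTF-8 byte order mark (bytes EF BB BF).
bool hasUtf8Bom(const std::string &data)
{
    return data.size() >= 3
        && static_cast<unsigned char>(data[0]) == 0xEF
        && static_cast<unsigned char>(data[1]) == 0xBB
        && static_cast<unsigned char>(data[2]) == 0xBF;
}
```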

This works well enough for Chinese users. Here is the background:

In Qt4, users normally use two different charsets for a
"cross-platform" application.
1) Under Windows, source code is saved as GB18030 (MinGW/MSVC)
2) Under Linux, source code is saved as UTF-8

Many Qt4 howtos written in Chinese use the following code, which is
really not good:
#if XXXXX
QTextCodec::setCodecForCStrings(QTextCodec::codecForName("GBK"));
...
#else
QTextCodec::setCodecForCStrings(QTextCodec::codecForName("UTF-8"));
...
#endif
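
To illustrate why the codec had to match the file encoding: the
literal "你好" is the bytes C4 E3 BA C3 when the file is saved as
GB18030/GBK, but E4 BD A0 E5 A5 BD when it is saved as UTF-8. The
sketch below is a simplified structural check (looksLikeUtf8 is a
hypothetical name, and it does not reject overlong forms or
surrogates); it only shows that the GBK bytes cannot be read as UTF-8.

```cpp
#include <cstddef>
#include <string>

// Simplified check: every lead byte must be followed by the right
// number of continuation bytes (10xxxxxx). Not a full validator.
bool looksLikeUtf8(const std::string &bytes)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;
        if (c < 0x80)
            extra = 0;                      // ASCII
        else if ((c & 0xE0) == 0xC0)
            extra = 1;                      // 110xxxxx
        else if ((c & 0xF0) == 0xE0)
            extra = 2;                      // 1110xxxx
        else if ((c & 0xF8) == 0xF0)
            extra = 3;                      // 11110xxx
        else
            return false;                   // stray continuation or invalid lead

        if (extra > bytes.size() - i - 1)
            return false;                   // sequence is cut off
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char n = static_cast<unsigned char>(bytes[i + k]);
            if ((n & 0xC0) != 0x80)
                return false;               // not a continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

With the UTF-8 bytes the check passes; with the GBK bytes it fails at
the second byte, because E3 is not a continuation byte.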

But now, with MSVC2010-SP1 and GCC 4.6, source files that contain
non-ASCII characters can be used in a cross-platform way. All we need is:
1) Source files saved with a BOM (supported by GCC too)
2) The following lines added to them:

#if _MSC_VER >= 1600
#pragma execution_character_set("utf-8")
#endif
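
A rough way to verify the result, assuming this file itself is saved
as UTF-8 (with a BOM for MSVC): compare the bytes of a narrow literal
against the known UTF-8 encoding of "你好". The names kGreeting and
narrowLiteralIsUtf8 are made up for this sketch.

```cpp
#include <cstring>

#if defined(_MSC_VER) && _MSC_VER >= 1600
#pragma execution_character_set("utf-8")
#endif

// Narrow literal containing non-ASCII characters.
static const char kGreeting[] = "你好";

// Returns true if the compiler stored the literal as UTF-8,
// i.e. as the six bytes E4 BD A0 E5 A5 BD.
bool narrowLiteralIsUtf8()
{
    static const unsigned char expected[] =
        { 0xE4, 0xBD, 0xA0, 0xE5, 0xA5, 0xBD };
    return sizeof(kGreeting) - 1 == sizeof(expected)
        && std::memcmp(kGreeting, expected, sizeof(expected)) == 0;
}
```

If this returns true, such a literal can be passed to
QString::fromUtf8 without setting any codec.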

Considering that users who love Qt5 will want to use a recent C++
compiler (MSVC2010-SP1+, GCC4.6+, maybe GCC4.5 works too), this isn't
a big problem.

This workaround still doesn't work for MSVC2008-SP1 out of the box,
but there is a hotfix for it: http://support.microsoft.com/kb/980263

So the only problem is MSVC2005.

BTW,
I don't know whether other compilers support UTF-8 BOM or not :-)

----------------------------------------------------------------------------

Debao

On Thu, Jun 7, 2012 at 4:07 AM, Thiago Macieira
<thiago.macieira at intel.com> wrote:
> On domingo, 22 de abril de 2012 12.49.59, Thiago Macieira wrote:
>> So the solution to make everything work is:
>>  1) always use UTF-8 encoded files
>>  2) mark your US-ASCII strings with QLatin1String
>>  3) everything else will either auto-convert, or use QString::fromUtf8 or
>>      QStringLiteral
>>
>> Now, I *really*, really don't care about source code that doesn't follow
>> step 1. The C++ Standards Committee decided to give us Unicode strings (a
>> very  modern action), but did not bother to specify the input character set
>> for source code (a very 1980s action).
>>
>> For that reason, considering that we live in a global world and that source
>> code is often shared among people in different countries. I am assuming
>> that every developer will choose to use UTF-8 given the option and that
>> every compiler and every text editor can understand it with minimal pain.
>>
>> Any compiler or text editor that can't understand UTF-8 (without a BOM)
>> will  receive from me the label of "crap" and will not take into
>> consideration the problems users using them have with the plan above.
>
> Oh well. That was good while it lasted.
>
>        https://codereview.qt-project.org/#change,28086
>
> MS didn't get the memo. MSVC 2010 meets my conditions for "crap".
>
> --
> Thiago Macieira - thiago.macieira (AT) intel.com
>  Software Architect - Intel Open Source Technology Center
>     Intel Sweden AB - Registration Number: 556189-6027
>     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
