[Development] Unicode/i18n support

John Layt jlayt at kde.org
Tue Nov 29 22:41:58 CET 2011


On 25 November 2011 08:30, <lars.knoll at nokia.com> wrote:


> I have been thinking a bit on how to move forward with Unicode support in
> Qt lately. The current state is in my opinion not sustainable.
>
> Unicode and i18n support consists of quite a few different tasks. Roughly
> speaking, we currently have a handful of places where Unicode data and
> support handling is being done.
>


> * ICU
>        contains everything we need and more. Uses utf16 as the internal
> encoding.
>        The "more" includes things such as:
>                * calendaring systems
>                * Full (and fast) collation support
>                * Timezone handling
>                * Unicode 6.0
>                * Full case folding support (including localized folding)
>                * Localized data for cities, calendars and other stuff
>                * Probably quite a few other things I forgot
>
> My proposal would be to simplify this setup and start relying on ICU for
> many of the tasks. We would still expose things through a Qt API though.
> It would simplify the maintenance of our Unicode support, as we can rely
> on ICU for most things.
>



> Opinions?
>

I'm generally in favour, even if it means throwing away most of my work
from the last few months :-)  In QLocale it will definitely save us a lot
of code and maintenance, give us advanced features at no extra cost, and
solve the locale data size problem for embedded platforms.  However,
there are probably a few implications to work through before fully
committing to it.

I'm assuming we would use ICU for all parsing and formatting of numbers,
currency, dates, times, etc?  And that we would continue to use the host
system settings, i.e. where the user has set something other than the
locale default?  ICU does provide an API to define which settings to use,
not just which locale is set, so this is covered.

Would we still keep the old Qt4 routines or discard them entirely?  The
existing number parsers/formatters are quite deeply embedded in various
classes, in particular for fast C locale parsing.  Removing them may have
wider implications that need checking.  For example, I don't know if the
QValidator classes can still report Intermediate states without our own
parsers.
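
To illustrate the QValidator concern, here is a minimal sketch in plain
C++ (not Qt's implementation) of a three-state check for a non-negative
integer range.  The name validateInt and the simplified rules are
assumptions for illustration only.  The point is that the parser has to
say "not valid yet, but could become valid with more typing", which a
one-shot parse that only reports success or failure cannot express.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Mirrors the three values of QValidator::State.
enum class State { Invalid, Intermediate, Acceptable };

// Hypothetical helper: checks `text` as a decimal integer in
// [minimum, maximum], assuming 0 <= minimum <= maximum.
State validateInt(const std::string &text, long minimum, long maximum)
{
    if (text.empty())
        return State::Intermediate;      // nothing typed yet

    long value = 0;
    for (char c : text) {
        if (!std::isdigit(static_cast<unsigned char>(c)))
            return State::Invalid;       // no further typing can fix this
        const long digit = c - '0';
        // Appending this digit would exceed maximum (also guards overflow).
        if (value > maximum / 10 ||
            (value == maximum / 10 && digit > maximum % 10))
            return State::Invalid;
        value = value * 10 + digit;
    }

    // Too small so far, but more digits could still reach the range.
    if (value < minimum)
        return State::Intermediate;
    return State::Acceptable;
}
```

With a range of 10 to 99, "5" is Intermediate, "42" is Acceptable and
"100" is Invalid.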

There are highly likely to be subtle and not-so-subtle behavioural changes,
e.g. in how certain formats are interpreted, strictness of parsing, etc.
For example, scientific notation in CLDR is usually 'E' but Qt4 always uses
'e'.  The date format codes in particular are different, and while my
changes for Qt5 were switching to using the CLDR codes, I did include a
compatibility mode to use the Qt4 codes, which we couldn't keep if we
switched fully to ICU.
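
As one concrete case of the format code difference: Qt4 writes the day
name as "ddd"/"dddd", while CLDR uses "EEE"/"EEEE" and reserves 'd' for
the day of the month.  The sketch below is plain C++ and covers only
that one code; the function name qt4DayNamesToCldr is hypothetical, and
a real compatibility layer would have to handle the other differing
codes as well.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: rewrites Qt4 day-name codes ("ddd", "dddd") as
// their CLDR equivalents ("EEE", "EEEE").  Text inside single quotes is
// a literal in both syntaxes and is copied unchanged.
std::string qt4DayNamesToCldr(const std::string &format)
{
    std::string out;
    bool quoted = false;
    for (std::size_t i = 0; i < format.size();) {
        const char c = format[i];
        if (c == '\'') {                 // toggle literal mode
            quoted = !quoted;
            out += c;
            ++i;
            continue;
        }
        if (quoted || c != 'd') {        // literal text or any other code
            out += c;
            ++i;
            continue;
        }
        // Measure the run of consecutive 'd' characters.
        std::size_t run = 0;
        while (i + run < format.size() && format[i + run] == 'd')
            ++run;
        i += run;
        while (run >= 4) {               // "dddd": long day name
            out += "EEEE";
            run -= 4;
        }
        if (run == 3)                    // "ddd": abbreviated day name
            out += "EEE";
        else                             // "d"/"dd": day of month, same code
            out.append(run, 'd');
    }
    return out;
}
```

For example, "dddd, d MMMM yyyy" becomes "EEEE, d MMMM yyyy".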

Also, if we're no longer using our own routines, but for example just
reading the Windows format settings and passing those to the ICU routines,
then wouldn't it just be better/quicker to call the Windows routines
directly and save the read of the Windows settings?  We'd only use ICU if
Windows didn't provide a feature.  While subtle differences might then
appear between platforms, they would be consistent with all other apps
running on those platforms?
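
The "native first, ICU only where needed" idea could be structured
roughly as below.  This is a plain C++ sketch under assumptions: the
Formatter type and formatWithFallback are hypothetical names, and the
real backends would wrap the platform and ICU calls.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// A backend returns true and fills `out` if it supports the request,
// or returns false to let the next backend try.
using Formatter = std::function<bool(double value, std::string &out)>;

// Tries each backend in order, e.g. the host platform first and a
// generic library second.  Returns an empty string if none applies.
std::string formatWithFallback(const std::vector<Formatter> &backends,
                               double value)
{
    std::string out;
    for (const Formatter &backend : backends) {
        if (backend(value, out))
            return out;
    }
    return std::string();
}
```

The ordering of the list is what encodes the policy, so platforms that
provide a feature natively never reach the fallback.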

If we're breaking behaviour, will there also be room for more
source-incompatible changes to align QLocale more closely with CLDR/ICU,
make it more consistent with itself, or make it more useful to KDE (see my
earlier email about QSystemLocale and other stuff [1])?  We already have to
break source compatibility slightly for the date/time API, and perhaps a
different API will make the behaviour changes more obvious?

[1]
http://lists.qt-project.org/pipermail/development/2011-October/000025.html

For Time Zones, while we can initially use ICU as a data source and
backend, I think we will still need to read the host system Time Zones,
both for compatibility and because the ICU tz file may be older than the
system tz file.  ICU will be a good source for consistent translations of
the zone names.
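
A minimal sketch of choosing between the two tz sources, in plain C++.
It assumes both sides report a tz release name in the usual "year plus
letter" form such as "2011n", which orders correctly as a plain string
comparison; the names TzSource and pickNewerTzData are hypothetical.

```cpp
#include <cassert>
#include <string>

enum class TzSource { System, Icu };

// Hypothetical helper: prefers the system tz data unless ICU's copy is
// strictly newer.  Versions are tz release names such as "2011n".
TzSource pickNewerTzData(const std::string &systemVersion,
                         const std::string &icuVersion)
{
    return icuVersion > systemVersion ? TzSource::Icu : TzSource::System;
}
```

Ties go to the system data so that behaviour matches other applications
on the same host.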

I've already been doing lots of work on QLocale so would be happy to work
on this if needed, especially as I already have the date/time api sorted,
and a lot of fixes to the Windows/OSX system locales.  I'll also rework my
existing QDateTime changes to be done in two stages, internal QDate
improvements and later QLocale/QCalendarSystem dependent changes.  Then I
need to figure out how to get the KDE locale to work in this scheme.

Cheers!

John.