[Development] ICU decision?
jlayt at kde.org
Thu Aug 8 00:39:41 CEST 2013
On Wednesday 07 Aug 2013 13:39:36 Koehne Kai wrote:
> > 07.08.2013, 12:38, "guillaume.belz at free.fr" <guillaume.belz at free.fr>:
> > Most of ICU size comes from its data which can be loaded from separate
> > .dat file, which can be customized. I believe most of developers don't
> > require all of ICU data (i.e. all kinds of data for all available
> > locales), and shrinking useless data will reduce package size greatly.
See my original email for an in-depth analysis of the data and library size.
A full ICU build is about 28MB on disk or an 11MB compressed download. This
can be reduced to about 11MB on disk / 6MB compressed for apps wanting just a
few major Western European languages. East Asian languages require rather
more. The absolute minimum is about 4MB with no data.
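
To put those figures side by side, here is a rough back-of-the-envelope
calculation. It only uses the approximate sizes quoted above; the helper name
`saving_percent` is made up for illustration, and treating the 4MB no-data
figure as an on-disk size is my assumption.

```python
# Approximate sizes quoted in this message, in MB.
FULL_DISK_MB = 28          # full ICU on disk
FULL_DOWNLOAD_MB = 11      # full ICU, compressed download
WESTERN_DISK_MB = 11       # a few major Western European languages, on disk
WESTERN_DOWNLOAD_MB = 6    # same subset, compressed download
NO_DATA_DISK_MB = 4        # code only, no data (assumed to be an on-disk size)


def saving_percent(full_mb, reduced_mb):
    """Percentage saved when shrinking from full_mb to reduced_mb."""
    return round(100 * (full_mb - reduced_mb) / full_mb, 1)


disk_saving = saving_percent(FULL_DISK_MB, WESTERN_DISK_MB)
download_saving = saving_percent(FULL_DOWNLOAD_MB, WESTERN_DOWNLOAD_MB)
no_data_saving = saving_percent(FULL_DISK_MB, NO_DATA_DISK_MB)

# Share of the full on-disk size that is data rather than code.
data_mb = FULL_DISK_MB - NO_DATA_DISK_MB

print(f"Western European subset: {disk_saving}% less disk, "
      f"{download_saving}% less download")
print(f"No data at all: {no_data_saving}% less disk ({data_mb}MB of data removed)")
```

So the reduced build saves roughly three fifths of the disk space but less
than half of the download, which matters for the people mainly worried about
bandwidth.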
However, to take advantage of these reductions each dev would have to build
their own version of ICU with reduced data, either built-in or in the .dat
file. Most people I talked to didn't want to learn how to do that and just
wanted something ready-made for their needs.
Most people I talked to from the Windows world were more concerned about the
download size and its effect on their bandwidth costs than about disk usage,
but mobile/embedded devs were obviously concerned about both.
It's fairly straightforward to script the build for devs to use; the hard part
is convincing them the dependency and download are worth it.
> I've been pondering shipping ICU with a separate .dat file. The problem
> is that the file has to be loaded before any ICU API is called - this is
> tricky to ensure from within Qt. It's also not that easy to actually find
> the ICU file - e.g. loading qt.conf already requires codecs.
> Finally, deciding what you actually want is not that easy, even with
> http://apps.icu-project.org/datacustom/ . Power users can certainly do so,
> but it's certainly not newbie friendly.
I think we concluded that finding the .dat file would be near impossible to do
reliably. Personally I think it's easier to modify the build to exclude the
unwanted data from the library than to use the Data Customiser, where it is
almost impossible for anyone to know what to tick or not tick.
> > Also, with zero-size fake data ICU will
> > be still functional, e.g. able to perform conversions between UTF flavors,
> > SCSU and BOCU-1.
> True, but that's not free either: Just icuuc and icuin are around 4 MB.
> Personally, I think ICU should be entirely optional for QtCore on Windows.
> Windows APIs offer much of the functionality we're using ICU for. QtCore
> could then optionally try to load an ICU plugin for additional data ...
Yes, the Win32 API provides enough for the current QLocale API, but may not
cover all the extra advanced features we want to offer. Note too that the data
is only part of the problem: it's just as much the code we want to get rid of.
We don't want to be implementing and maintaining lots of advanced localization