[Development] ICU decision?

John Layt jlayt at kde.org
Thu Aug 8 00:39:41 CEST 2013


On Wednesday 07 Aug 2013 13:39:36 Koehne Kai wrote:
> > 07.08.2013, 12:38, "guillaume.belz at free.fr" <guillaume.belz at free.fr>:

> > Most of ICU size comes from its data which can be loaded from separate
> > .dat file, which can be customized. I believe most of developers don't
> > require all of ICU data (i.e. all kinds of data for all available
> > locales), and shrinking useless data will reduce package size greatly.

See my original email for an in-depth analysis of the data and library size.
The full ICU is about 28MB on disk or 11MB as a compressed download.  This
can be reduced to about 11MB on disk / 6MB compressed for apps wanting just a
few major Western European languages.  East Asian languages require rather
more.  The absolute minimum is about 4MB with no data.
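
To put those numbers in proportion, here is a rough sketch of the arithmetic.
It uses only the approximate sizes quoted above; the helper name `reduction`
is made up for illustration and is not part of any ICU or Qt tooling.

```python
def reduction(full_mb, reduced_mb):
    """Return (MB saved, percent saved) when going from full_mb to reduced_mb."""
    saved = full_mb - reduced_mb
    return saved, round(100 * saved / full_mb, 1)

# Approximate sizes in MB, taken from the figures quoted in this mail.
FULL_DISK, FULL_DOWNLOAD = 28, 11
WESTERN_DISK, WESTERN_DOWNLOAD = 11, 6
NO_DATA_DISK = 4

print("Western subset, disk:    ", reduction(FULL_DISK, WESTERN_DISK))
print("Western subset, download:", reduction(FULL_DOWNLOAD, WESTERN_DOWNLOAD))
print("No data at all, disk:    ", reduction(FULL_DISK, NO_DATA_DISK))
```

So the Western European subset saves roughly 60% on disk but only about 45%
of the download, and even with no data at all around 4MB of library remains.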

However, to take advantage of these reductions each dev would have to build
their own version of ICU with reduced data, either built in or in the .dat
file.  Most people I talked to didn't want to learn how to do that and just
wanted something ready-made for their needs.

Most people I talked to from the Windows world were more concerned about the
download size and its effect on their bandwidth costs than about disk usage,
but mobile/embedded devs were obviously concerned about both.

It's fairly straightforward to script the build for devs to use; the hard
part is convincing them that the dependency and download are worth it.

> I've been pondering with shipping ICU with a separate .dat file. The problem
> is that the file has to be loaded before any ICU api is called - this is
> tricky to ensure from within Qt. It's also not that easy to actually find
> the ICU file - e.g. loading qt.conf already requires codecs.
> 
> Finally, deciding what you actually want is not that easy, even with
> http://apps.icu-project.org/datacustom/ . Power users can certainly do so,
> but it's certainly not newbie friendly.

I think we concluded that finding the .dat file would be near impossible to
do reliably.  Personally I think it's easier to modify the build to exclude
the unwanted data from the library than to use the Data Customiser, where it
is almost impossible for anyone to know what to tick or not tick.

> > Also, with zero-size fake data ICU will
> > be still functional, e.g. able to perform conversions between UTF flavors,
> > SCSU and BOCU-1.
> 
> True, but that's not free either: Just icuuc and icuin are around 4 MB
> together.
> 
> Personally, I think ICU should be entirely optional for QtCore on Windows.
> Windows API's offers much of the functionality we're using ICU for. QtCore
> could then optionally try to load an ICU plugin for additional data ...

Yes, the Win32 API provides enough for the current QLocale API, but may not
be enough for all the extra advanced features we want to offer.  Note too
that the data is only part of the problem: it's just as much the code that we
want to get rid of.  We don't want to be implementing and maintaining lots of
advanced localization code ourselves.

John.



