[Development] RE: ICU and Windows

Koehne Kai Kai.Koehne at digia.com
Fri Mar 8 16:56:43 CET 2013



> -----Original Message-----
> From: development-bounces+kai.koehne=digia.com at qt-project.org
> [mailto:development-bounces+kai.koehne=digia.com at qt-project.org] On
> Behalf Of John Layt
> Sent: Thursday, March 07, 2013 9:55 PM
> To: development at qt-project.org
> Subject: Re: [Development] ICU and Windows
> 
> On Thursday 07 Mar 2013 16:16:05 Koehne Kai wrote:
> > >> On 02/06/2013 11:20 PM, Koehne Kai wrote:
> > >> > [...]
> > >> > That is what we should do indeed. I learned from
> > >> >
> > >> > http://userguide.icu-project.org/icudata
> > >> >
> > >> > that one can also ship the ICU data in separate .data files,
> > >> > located in an "ICU data directory" that can be specified e.g. at
> > >> compilation time. So
> > >> how about creating a bare minimum icudt.dll, and rather ship .data
> > >> files in a well-known place ($$[QT_INSTALL_ICU])? This would allow
> > >> anyone with enough knowledge to tailor what exactly they want to
> > >> ship, while reducing the footprint of "I don't care about
> > >> localization for hello world" types of applications.
> > >
> > >Alright, this is what I found out so far: You can configure ICU to
> > >either place all the data in the icudt49 library, or in one big .dat
> > >file at a specified location, or as individual files. Having multiple
> > >.dat files is supported too, but that requires someone deciding how
> > >they should be split up.
> > >
> > >Shipping the library is what we have right now, which is IMO not
> > >acceptable. Just check out the comments on the 5.0.1 release blog,
> > >there are people still caring about 20 MB overhead :) Shipping the
> > >default .dat file would mean that either your app doesn't have any
> > >codec support etc at all, or has everything. If we ship individual
> > >files we'd need to ship 2345 files... that gives full flexibility,
> > >but good luck for the poor developer trying to find out what he
> > >needs :)
> > >
> > >So the bottom line for me is: _We_ have to come up with an ICU
> > >profile that contains what we consider important, and which we want
> > >to ship in our default icudt library. If someone needs additional
> > >things he could add it by just shipping e.g. an additional .dat
> > >file. [...]
> 
> > Thoughts? Comments? Praises? ;)
> 
> This is an issue that will only get worse if/when we hard require ICU for all
> the localization data, so we do need to work out an acceptable solution.
> 
> One of the ideas behind moving to ICU was to allow embedded devs to
> decide what locales they wanted to ship to save space, rather than having
> them all embedded in QtCore.  It's obviously a feature that the Windows
> devs would appreciate too, although 20MB really doesn't seem that much to
> me :-).

That depends on the viewpoint, I guess :) It's a lot if you have to add it to a minimal console app that uses only QtCore.

> I assume the idea of your patch is to use
> http://apps.icu-project.org/datacustom/ to create the minimal .dat file and
> include that in Qt along with the empty dll?

No, actually not. The patch is just about being able to use an ICU that is compiled with '--with-data-packaging=archive'. That is, the icudt library is empty, and all the contents are in a separate .dat file. The tricky part, however, is to make sure we find the .dat file at runtime ...

> One option to note is that according to
> http://userguide.icu-project.org/icudata in the "Reducing the Size of ICU's
> Data" sections you can modify the ICU build to put less data in the dll
> simply by removing the mk files for the conversion tables and locale and
> collation data you don't need.
> Keeping only the core conversion tables apparently reduces the data to
> about 5MB.  This would also save any messing with having to load a .dat file.

Sure, but the question is whether we can come up with a sensible default for the ICU libs in the official installer. There seems to be some data that current Qt / WebKit APIs do not use, e.g. Collators. But if I de-select e.g. "Collators", "Rule Based Number Format", "Transliterators" from http://apps.icu-project.org/datacustom/ it's still 15.5 MB :(

> It's clear that we can't require devs to know how to pick and choose what ICU
> resources they do and don't need, as they won't know what ones are
> needed and which are optional.  Perhaps we need to write a tool or build
> script that helps devs with choosing what resources to include, and then
> deletes the mk files and builds ICU for them.  In this way we avoid having to
> build or ship ICU ourselves, but still make life easier for devs.  If the script
> could also tell qmake what features to enable/disable in Qt as a result then
> even better.

ICU has these tools already: http://apps.icu-project.org/datacustom/, icupkg, and gencmn. What's missing is IMO mainly some documentation on which Qt modules / APIs use which data.

> The first step is obviously to compile the list of all the resources we currently
> need and don't need, and the conversion tables are the obvious main target
> for reducing the size.

True.
 
> I'll make sure when I do the QLocale changes that devs can choose to only
> use the system locale resources on the understanding that they won't have
> custom locales or advanced features like collation available.

Sounds good :)

Regards

Kai


