[Development] Proposed solution to the ICU problem

Konstantin Ritt ritt.ks at gmail.com
Wed Aug 8 17:45:44 CEST 2012


2012/8/8  <lars.knoll at nokia.com>:
> For basic Unicode data that we need to render text, it might however make sense to keep a copy if this makes a big enough difference in performance.

Depends on what you mean by "a big enough difference in performance".

2012/8/8  <lars.knoll at nokia.com>:
> Let's just go through things and decide together where to best use ICU and where it makes sense to do our own stuff. But I'd like to see a decent justification for the places we want to keep our own data and/or implementation.

Okay, let's see:
* A proper BiDi algorithm implementation [UAX9] - a bunch of
improvements, optimizations and fixes (e.g. adding support for non-BMP
code points, whitespace handling, respecting the paragraph
level, etc.);
* Proper script itemization [UAX24] - rewritten from scratch in
order to add non-BMP code point support, properly handle the "Common" and
"Inherited" script property values, etc.
  This also includes the Unicode script property value mapping for
the entire Unicode code point range, and it significantly improves
the text analysis/segmentation quality and the text shaping/rendering
quality for complex scripts when a decent-enough text shaping
engine is used (e.g. Harfbuzz-NG or Uniscribe/DirectWrite);
* A set of improvements for QTextBoundaryFinder [UAX14, UAX29] -
of course, we can drop our implementation and simply make QTBF a thin
wrapper around ICU's grapheme/word/sentence/line iterators...;
* The [Unicode] Normalization Form Quick Check implementation
described in UAX15 - well, for the entire
normalization/composition/decomposition process it is possible to use
ICU and thus get rid of rarely used data in our QUnicodeTables.
However, it is a shame that we're sticking to a huge monster like ICU
just for such simple things as context-based case folding,
normalization, or better CLDR support; the only thing we don't really
have in Qt is collation support, which can still be implemented.
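
To make the script itemization point more concrete, here is a minimal
sketch of the idea. It is not the actual Qt code: scriptOf() is a
hypothetical lookup that only knows a few approximate block ranges, and
the resolution rule for "Common" and "Inherited" is deliberately
simplified (such characters join the current run, and a leading run of
them adopts the first real script that follows). It also shows why
surrogate pairs must be decoded first so that code points outside the
BMP get a script at all.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy subset of script property values; UAX #24 defines many more.
enum class Script { Common, Inherited, Latin, Greek, Cyrillic, Gothic };

// Hypothetical lookup with approximate block ranges, for illustration only.
Script scriptOf(char32_t cp)
{
    if (cp >= 0x0300 && cp <= 0x036F) return Script::Inherited; // combining marks
    if ((cp >= U'A' && cp <= U'Z') || (cp >= U'a' && cp <= U'z')) return Script::Latin;
    if (cp >= 0x0370 && cp <= 0x03FF) return Script::Greek;
    if (cp >= 0x0400 && cp <= 0x04FF) return Script::Cyrillic;
    if (cp >= 0x10330 && cp <= 0x1034F) return Script::Gothic;  // outside the BMP
    return Script::Common; // spaces, digits, punctuation, anything unknown here
}

// A run of text in one script; start and length are in UTF-16 code units.
struct ScriptRun { std::size_t start; std::size_t length; Script script; };

static bool isNeutral(Script s)
{
    return s == Script::Common || s == Script::Inherited;
}

std::vector<ScriptRun> itemize(const std::u16string &text)
{
    std::vector<ScriptRun> runs;
    std::size_t i = 0;
    while (i < text.size()) {
        // Decode one code point, joining a valid surrogate pair into one value.
        char32_t cp = text[i];
        std::size_t units = 1;
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < text.size()
            && text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (text[i + 1] - 0xDC00);
            units = 2;
        }
        const Script s = scriptOf(cp);
        if (runs.empty()) {
            runs.push_back({i, units, s});
        } else {
            ScriptRun &last = runs.back();
            if (isNeutral(s) || s == last.script) {
                last.length += units;   // neutral characters join the current run
            } else if (isNeutral(last.script)) {
                last.script = s;        // leading neutrals adopt the first real script
                last.length += units;
            } else {
                runs.push_back({i, units, s});
            }
        }
        i += units;
    }
    return runs;
}
```

With this sketch, a string such as u"ab, \u03B1\u0301\U00010330" is
split into a Latin run (including the comma and the space), a Greek run
(including the combining accent) and a Gothic run that spans two UTF-16
code units.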


Konstantin


