[Development] Regular expression libraries for QRegExp

lars.knoll at nokia.com
Tue Nov 22 10:58:02 CET 2011


On 11/22/11 2:11 AM, "ext Craig.Scott at csiro.au" <Craig.Scott at csiro.au>
wrote:

>
>On 22/11/2011, at 2:45 AM, Giuseppe D'Angelo wrote:
>
>> On 16 November 2011 16:08,  <marius.storm-olsen at nokia.com> wrote:
>>> Yes, the implementation based on UTF-8 vs UTF-16 version of PCRE would
>>> only differ on two lines, the UTF-16 -> UTF-8 and UTF-8 -> UTF-16
>>> conversion before and after the matching.
>>> 
>>> I suggest we get started on this with the current version of PCRE, and
>>> hope that entices the PCRE team to work on a proper UTF-16
>>> implementation.
>>> 
>>> Anyone interested in jumping on this task?
>> 
>> I can volunteer some time :)
>> 
>> But first: do we all (esp. Thiago, Lars) agree to use the UTF-8
>> version for now (and pay for the pattern/subject string/offsets
>> conversions) and then write and enable a UTF-16 codepath when PCRE
>> ships with proper support for it (by detecting its version at
>> runtime)?
>> 
>> Also: what's the minimum PCRE version Qt should require? I see that
>> Debian 6 (stable) uses 8.02 [1], Ubuntu 10.04 LTS uses 7.8 [2]. For
>> other distributions of course YMMV. Is it OK to depend on even more
>> recent versions? For instance, PCRE 8.10 adds UCP support (basically
>> make \w \d etc. match the corresponding Unicode properties), and PCRE
>> 8.20 adds a JIT feature (which promises large performance benefits) [3]
>> [4].
>> Again: should we depend on an "old" version, detect the actual one
>> at runtime, and optionally enable those features?
>> 
>
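
On the runtime detection question, a minimal sketch, assuming the
version string has the "major.minor date" shape that pcre_version()
returns (for example "8.20 2011-10-21"). The names PcreVersion,
parsePcreVersion and atLeast are made up for illustration and exist in
neither PCRE nor Qt.

```cpp
#include <cstdio>

// Hypothetical type, for illustration only.
struct PcreVersion {
    int major;
    int minor;
};

// Parses the leading "major.minor" of a version string such as
// "8.20 2011-10-21". Returns false if that prefix is missing.
// Note: "8.02" yields minor 2 and "8.20" yields minor 20, so the
// numeric comparison below orders them correctly.
bool parsePcreVersion(const char *text, PcreVersion *out)
{
    int major = 0;
    int minor = 0;
    if (std::sscanf(text, "%d.%d", &major, &minor) != 2)
        return false;
    out->major = major;
    out->minor = minor;
    return true;
}

// True if version v is the given major.minor or newer.
bool atLeast(const PcreVersion &v, int major, int minor)
{
    return v.major > major || (v.major == major && v.minor >= minor);
}
```

A caller would feed it the result of pcre_version() and turn on UCP
only if atLeast(v, 8, 10) holds, or the JIT only if atLeast(v, 8, 20)
holds.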
>
>I would suggest that the Qt source should include its own local copy of
>pcre and a configure time switch should allow selection between the
>system or the local (Qt source) version of pcre. This is already the
>approach offered for things like image-related plugins (libpng, libjpeg,
>etc.). My main motivation for suggesting this is that, as far as I can
>tell (since the Linux Foundation servers are still not back up I can't
>confirm), pcre is not part of the LSB, which means if you want to build
>Qt to be LSB compliant, you will need to statically link to pcre rather
>than rely on the system libs. It would greatly simplify things if Qt's
>own build system made this pcre library selection easy and it would also
>have the benefit that if you are building Qt from source yourself rather
>than relying on the system wide Qt, then you are less at the mercy of
>what pcre version the system provides. This will also make it easier for
>Qt to more quickly take advantage of an updated UTF-16 pcre when it is
>implemented.
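
Until a UTF-16 pcre exists, the offsets the UTF-8 matcher reports are
byte offsets and have to be mapped back to UTF-16 code units. A rough
sketch of that mapping, assuming well-formed UTF-8 and an offset that
falls on a character boundary; utf8OffsetToUtf16 is a hypothetical
helper, not an existing PCRE or Qt function.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: converts a byte offset into UTF-8 text to the
// matching offset counted in UTF-16 code units.
// Assumes `utf8` is well-formed and `byteOffset` is on a character boundary.
std::size_t utf8OffsetToUtf16(const std::string &utf8, std::size_t byteOffset)
{
    std::size_t units = 0;
    for (std::size_t i = 0; i < byteOffset && i < utf8.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(utf8[i]);
        if ((c & 0xC0) == 0x80)
            continue;                   // continuation byte, already counted
        units += (c >= 0xF0) ? 2 : 1;   // 4-byte sequence -> surrogate pair
    }
    return units;
}
```

The walk is linear in the offset, which is part of the conversion cost
a native UTF-16 codepath would avoid.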

On the other hand I would really like to reduce the amount of 3rdparty
code that we have in qtbase.git, and not add another library there.

IMO a better solution would be to have a repository called e.g. qtsupport
(KDE had something similar for quite a while) that contains copies of
these 3rd party libraries for convenience.

Cheers,
Lars



