[Development] Regular expression libraries for QRegExp

Craig.Scott at csiro.au
Tue Nov 22 02:11:26 CET 2011


On 22/11/2011, at 2:45 AM, Giuseppe D'Angelo wrote:

> On 16 November 2011 16:08,  <marius.storm-olsen at nokia.com> wrote:
>> Yes, the implementation based on UTF-8 vs UTF-16 version of PCRE would
>> only differ on two lines, the UTF-16 -> UTF-8 and UTF-8 -> UTF-16
>> conversion before and after the matching.
>> 
>> I suggest we get started on this with the current version of PCRE, and
>> hope that entices the PCRE team to work on a proper UTF-16 implementation.
>> 
>> Anyone interested in jumping on this task?
> 
> I can volunteer some time :)
> 
> But first: do we all (esp. Thiago, Lars) agree to use the UTF-8
> version for now (and pay for the pattern/subject string/offsets
> conversions) and then write and enable a UTF-16 codepath when PCRE
> ships with proper support for it (by detecting its version at
> runtime)?
> 
> Also: what's the minimum PCRE version Qt should require? I see that
> Debian 6 (stable) uses 8.02 [1], Ubuntu 10.04 LTS uses 7.8 [2]. For
> other distributions of course YMMV. Is it OK to depend on even more
> recent versions? For instance, PCRE 8.10 adds UCP support (basically
> make \w \d etc. match the corresponding Unicode properties), and PCRE
> 8.20 adds a JIT feature (which promises large performance benefits) [3]
> [4].
> Again: should we depend on an "old" version, detect the actual one at
> runtime, and optionally enable those features?
> 


I would suggest that the Qt source include its own local copy of pcre, with a configure-time switch to select between the system version and the local (Qt source) version. This is already the approach offered for the libraries behind the image-related plugins (libpng, libjpeg, etc.). My main motivation is that, as far as I can tell, pcre is not part of the LSB (I can't confirm this, since the Linux Foundation servers are still not back up). If so, building Qt to be LSB compliant means linking pcre statically rather than relying on the system libs. It would greatly simplify things if Qt's own build system made this pcre library selection easy. It would also mean that if you build Qt from source yourself rather than relying on the system-wide Qt, you are less at the mercy of whatever pcre version the system provides. Finally, it would make it easier for Qt to quickly take advantage of a UTF-16 pcre once one is implemented.
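
Whichever copy ends up being linked, the version-dependent features Giuseppe mentions (UCP from 8.10, JIT from 8.20) could be gated on the version string the library reports. The following is only a rough sketch of that idea, not a proposal for the actual Qt code. The names PcreFeatures and featuresForVersion are made up for illustration, and the sketch assumes the string starts with "major.minor" as in the strings returned by pcre_version(), e.g. "8.20 2011-10-21".

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper type, not part of Qt or PCRE.
struct PcreFeatures {
    int major = 0;
    int minor = 0;
    bool ucp = false; // Unicode properties for \w, \d etc. (8.10 or later, per this thread)
    bool jit = false; // JIT compilation (8.20 or later, per this thread)
};

// Parse a version string that is assumed to start with "major.minor"
// and decide which optional features could be enabled.
// Unparseable input yields a default-constructed result (nothing enabled).
PcreFeatures featuresForVersion(const std::string &versionString)
{
    PcreFeatures f;
    if (std::sscanf(versionString.c_str(), "%d.%d", &f.major, &f.minor) != 2)
        return PcreFeatures();

    // Compare (major, minor) pairs; "8.02" parses as minor 2, "8.10" as minor 10.
    const bool atLeast8_10 = f.major > 8 || (f.major == 8 && f.minor >= 10);
    const bool atLeast8_20 = f.major > 8 || (f.major == 8 && f.minor >= 20);

    f.ucp = atLeast8_10;
    f.jit = atLeast8_20;
    return f;
}
```

With a bundled copy the answer is known at configure time, so a check like this would matter mainly when the system pcre is selected.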


--
Dr Craig Scott
Computational Software Engineering Team Leader, CSIRO (CMIS)
Melbourne, Australia
