[Development] Regular expression libraries for QRegExp

Thiago Macieira thiago at kde.org
Wed Nov 23 15:35:57 CET 2011


On Tuesday, 22 November 2011 15:07:46 Giuseppe D'Angelo wrote:
> On 32bit Linux:
> 
>    text	   data	    bss	    dec	    hex	filename
>  387691	    728	    176	 388595	  5edf3	libpcre-jit/lib/libpcre.so
>  260245	    580	     12	 260837	  3fae5	libpcre/lib/libpcre.so
>  159869	    416	     12	 160297	  27229	libpcre-noucp/lib/libpcre.so
> 
> First line is with JIT and UCP, second UCP only, third neither of them.

Question: where does it get its Unicode tables from? Are they compiled in, or 
does it link to another library, such as ICU?

Here, my libpcre links to nothing but libc.

> Ok. My only concern was that const methods of the pattern will have to
> compile it (... modify the shared instance) behind the scenes. I'm not
> sure if you wanted to get rid of these behaviours (thus, the "compiled
> pattern" class).

The API must not suffer from implementation details. We make a good API, then 
we find a way to make it work.

At first thought, I'd say that the pattern class should be a regular, 
implicitly-shared, atomic copy-on-write value class. If you call a non-const 
method, it detaches.
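
To make the copy-on-write idea concrete, here is a minimal sketch using only the standard library. Every name in it (Pattern, setText(), sharesDataWith(), copyOnWriteDemo()) is hypothetical and not a proposed Qt API, and std::shared_ptr merely stands in for whatever shared-data pointer the real class would use (in Qt, presumably QSharedDataPointer).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical value class; not the Qt API.
class Pattern {
public:
    explicit Pattern(std::string text = {})
        : d(std::make_shared<Data>(Data{std::move(text), false})) {}

    // Const accessors only read the shared data.
    const std::string &text() const { return d->text; }
    bool caseInsensitive() const { return d->caseInsensitive; }

    // Non-const setters detach first, so other copies are unaffected.
    void setText(std::string text) { detach(); d->text = std::move(text); }
    void setCaseInsensitive(bool on) { detach(); d->caseInsensitive = on; }

    // Exposed only so the example can observe sharing.
    bool sharesDataWith(const Pattern &other) const { return d == other.d; }

private:
    struct Data { std::string text; bool caseInsensitive; };

    // Take a private copy of the data if anyone else still refers to it.
    void detach() {
        if (d.use_count() > 1)
            d = std::make_shared<Data>(*d);
    }

    std::shared_ptr<Data> d;  // reference count is updated atomically
};

// Usage: copies share data until one of them is modified.
inline void copyOnWriteDemo() {
    Pattern a("a+");
    Pattern b = a;                  // cheap copy, data is shared
    assert(a.sharesDataWith(b));
    b.setText("b+");                // non-const call: b detaches
    assert(!a.sharesDataWith(b));
    assert(a.text() == "a+");       // a is untouched
}
```

The point of the sketch is only that detaching happens in non-const methods and nowhere else; const methods never write to the shared data.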

There should be no const methods that modify internal caches. Period. If you 
compile the pattern, it's a non-const method and it detaches. The pattern is 
also not the matcher class, it does not keep the captures, so there's no data 
to be modified in the *pattern* when you do a matching.

It may provide a convenience contains() or indexIn() function that executes 
matching. Those must be const and not modify the pattern.

If you want to deal with captures, you need a different class from the pattern. 
That's the matcher class.
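
A rough sketch of that split, under some loud assumptions: the class names RegexPattern and RegexMatcher are made up, std::regex stands in for the real engine (PCRE or otherwise), and the pattern is compiled eagerly in the constructor, which is just one way to keep compilation out of const methods.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical pattern class: immutable once built, holds no match results.
class RegexPattern {
public:
    // Assumption: compile at construction, so no const method compiles lazily.
    explicit RegexPattern(const std::string &text)
        : m_text(text), m_compiled(text) {}

    const std::string &text() const { return m_text; }

    // Convenience queries: const, and they store nothing in the pattern.
    bool contains(const std::string &subject) const {
        return std::regex_search(subject, m_compiled);
    }

    // Offset of the first match in subject, or -1 if there is none.
    long indexIn(const std::string &subject) const {
        std::smatch m;
        if (!std::regex_search(subject, m, m_compiled))
            return -1;
        return static_cast<long>(m.position(0));
    }

private:
    friend class RegexMatcher;
    std::string m_text;
    std::regex m_compiled;
};

// Hypothetical matcher class: owns the captures, so the pattern never changes.
class RegexMatcher {
public:
    explicit RegexMatcher(const RegexPattern &pattern) : m_pattern(pattern) {}

    // Non-const: running a match replaces the stored captures.
    bool match(const std::string &subject) {
        m_captures.clear();
        std::smatch m;
        if (!std::regex_search(subject, m, m_pattern.m_compiled))
            return false;
        for (const auto &sub : m)
            m_captures.push_back(sub.str());
        return true;
    }

    // Capture 0 is the whole match; 1..n are the groups.
    const std::vector<std::string> &captures() const { return m_captures; }

private:
    RegexPattern m_pattern;
    std::vector<std::string> m_captures;
};
```

With this shape, the same RegexPattern can be handed to any number of matchers, and all mutable state from a match lives in the RegexMatcher.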

-- 
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center
      PGP/GPG: 0x6EF45358; fingerprint:
      E067 918B B660 DBD1 105C  966C 33F5 F005 6EF4 5358