[Development] QRegExp::W3CXmlSchema11 problems

Giuseppe D'Angelo dangelog at gmail.com
Fri Mar 9 20:15:05 CET 2012


Hi,

I'm working on a QRegExp::operator QRegularExpression() implementation
so that we can remove QRegExp from the public API (reducing the
breakages to a minimum -- hopefully nobody was actually using implicit
conversions for those methods, and the QRegExp constructor is
explicit). But I'm having a hard time figuring out how to properly
port regexps in the W3CXmlSchema11 syntax.

This particular syntax is supposed to make use of XSD 1.1 regexps,
described in [1]. They're Perl-like regexps with some significant
differences, for instance:
- no lazy quantifiers
- support for character class subtraction: [a-z-[aeiou-[b-k]]]
- new multi-character escapes like \i and \c
- "ordinary" multi-character escapes (like \w, \s, etc.) matching
  different Unicode subsets
- etc.
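
To illustrate the character class subtraction item above, here is a
minimal sketch (standard C++ only; it is neither Qt code nor the actual
conversion routine) of how a subtraction could be rewritten into a
Perl-style negative lookahead. The names xsdClassToPerl,
matchesXsdClass and checkSubtractionExample are made up for this
example. The sketch assumes the class contains no escaped brackets and
that "-[" only ever introduces a subtraction.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: rewrites one XSD character class expression,
// e.g. "[a-z-[aeiou]]", into a negative lookahead followed by the base
// class, e.g. "(?![aeiou])[a-z]".
// Assumptions: cls starts with '[' and ends with ']', contains no
// escaped brackets, and "-[" only ever introduces a subtraction.
std::string xsdClassToPerl(const std::string &cls)
{
    const std::size_t minus = cls.find("-[");
    if (minus == std::string::npos)
        return cls; // plain class, nothing to rewrite

    // Characters between the opening '[' and the "-[" of the subtraction.
    const std::string base = cls.substr(1, minus - 1);
    // The subtracted class, from its '[' up to its matching ']'; the
    // final ']' of cls closes the outer class and is dropped.
    const std::string subtracted =
        cls.substr(minus + 1, cls.size() - minus - 2);

    // Subtractions can nest, so the subtracted class is rewritten too.
    return "(?!" + xsdClassToPerl(subtracted) + ")[" + base + "]";
}

// Hypothetical helper: checks whether text, as a whole, matches the
// rewritten class, using std::regex (ECMAScript grammar) as a stand-in
// for a Perl-compatible engine.
bool matchesXsdClass(const std::string &cls, const std::string &text)
{
    return std::regex_match(text, std::regex(xsdClassToPerl(cls)));
}

// Usage example: [a-z-[aeiou-[b-k]]] is a-z minus (aeiou minus b-k),
// i.e. a-z without 'a', 'o' and 'u'.
void checkSubtractionExample()
{
    const std::string cls = "[a-z-[aeiou-[b-k]]]";
    assert(xsdClassToPerl(cls) == "(?!(?![b-k])[aeiou])[a-z]");
    assert(matchesXsdClass(cls, "e"));  // 'e' is in b-k, so it is kept
    assert(!matchesXsdClass(cls, "a")); // 'a' is subtracted
}
```

A real converter would need a proper tokenizer to deal with escapes
and with a literal '-' or '[' inside a class; the lookahead trick is
just one possible way of expressing subtraction in Perl syntax.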

Now the problem is that the support for this syntax inside QRegExp is
quite broken: character class subtraction isn't supported, \i and \c
don't match what they're supposed to match, the other multi-character
escapes don't match the right Unicode subsets. This particular syntax
has *never* been tested in Qt (I added the very first test in Qt5 a
few days ago), and the only two uses of it I can find are
  qtbase/examples/tools/regexp/regexpdialog.cpp:71:
    syntaxComboBox->addItem(tr("W3C Xml Schema 1.1"), QRegExp::W3CXmlSchema11);
  qtxmlpatterns/src/xmlpatterns/functions/qpatternplatform.cpp:200:
    QRegExp retval(rewrittenPattern, Qt::CaseSensitive, QRegExp::W3CXmlSchema11);

Therefore, I'm wondering whether it's worth spending more time writing
a proper XSD -> Perl regexp conversion routine (as I'm currently
doing), or whether we should simply drop XSD regexps from the
QRegExp -> QRegularExpression conversion altogether.

Any thoughts?

(For the record, the support was added by commit 210bd7b in Qt 4 [2],
but I don't know the rationale for the whole feature, and the commit
message doesn't say anything about it. Was it just for the xmlpatterns
module?)

-- 
Giuseppe D'Angelo

[1] http://www.w3.org/TR/xmlschema-2/#regexs
[2] http://qt.gitorious.org/qt/qt/commit/210bd7b6033e41aad61fe131002dc5e496d7427a


