[Development] QTextBoundaryFinder behavior change in Qt-5.0

Konstantin Ritt ritt.ks at gmail.com
Sat Jul 28 13:34:25 CEST 2012


2012/7/28 David Faure <faure at kde.org>:
> I'm seeing a unittest failure in KDE Frameworks (sonnet framework) due to
> changes in QTextBoundaryFinder. This isn't my domain of expertise, so can I
> ask you to take a look at the information below, to find out if it's an
> intentional change or a bug, and/or if maybe the sonnet code is buggy?
>
> static bool
> finderNextWord(QTextBoundaryFinder &finder, QString &word, int &bufferStart)
> {
>     QTextBoundaryFinder::BoundaryReasons boundary = finder.boundaryReasons();
>     int start = finder.position(), end = finder.position();
>     bool inWord = (boundary & QTextBoundaryFinder::StartWord) != 0;
>     while (finder.toNextBoundary() > 0) {
>         boundary = finder.boundaryReasons();
>         if ((boundary & QTextBoundaryFinder::EndWord) && inWord) {
>             end = finder.position();
>             QString str = finder.string().mid(start, end - start);
>             if (isValidWord(str)) {
>                 word = str;
>                 bufferStart = start;
>                 qDebug().nospace() << "Word at " << start << " word="
>                          <<  str << ", len=" << str.length();
>                 return true;
>             }
>             inWord = false;
>         }
>         if ((boundary & QTextBoundaryFinder::StartWord)) {
>             start = finder.position();
>             inWord = true;
>         }
>     }
>     return false;
> }
>
> The unittest starts from the string
>    QString buffer( "This is     a sample buffer.Please test me .");
> and calls the above method repeatedly.
>
> The result with Qt5 is
>
> QDEBUG : SonnetFilterTest::testFilter() Word at 0 word="This", len=4
> QDEBUG : SonnetFilterTest::testFilter() Word at 5 word="is", len=2
> QDEBUG : SonnetFilterTest::testFilter() Word at 12 word="a", len=1
> QDEBUG : SonnetFilterTest::testFilter() Word at 14 word="sample", len=6
> QDEBUG : SonnetFilterTest::testFilter() Word at 21 word="buffer.Please", len=13
> FAIL!  : SonnetFilterTest::testFilter() Compared values are not the same
>    Actual   (w.word): buffer.Please
>    Expected (hits[hitNumber].word): buffer
>
> The result with Qt4 is
>
> QDEBUG : SonnetFilterTest::testFilter() Word at 0 word="This", len=4
> QDEBUG : SonnetFilterTest::testFilter() Word at 5 word="is", len=2
> QDEBUG : SonnetFilterTest::testFilter() Word at 12 word="a", len=1
> QDEBUG : SonnetFilterTest::testFilter() Word at 14 word="sample", len=6
> QDEBUG : SonnetFilterTest::testFilter() Word at 21 word="buffer", len=6
> QDEBUG : SonnetFilterTest::testFilter() Word at 28 word="Please", len=6
> QDEBUG : SonnetFilterTest::testFilter() Word at 35 word="test", len=4
> QDEBUG : SonnetFilterTest::testFilter() Word at 40 word="me", len=2
>
> So the dot character is no longer a word separator?

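It is. But it is definitely not whitespace, and the current QTBF
implementation tries to _guess_ the StartWord/EndWord attributes from the
presence (or absence) of surrounding whitespace. With no whitespace on
either side of the dot in "buffer.Please", neither attribute is reported
at that boundary, so your loop sees one long word.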
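
To make the effect concrete, here is a toy model of that kind of
whitespace-based guessing. It is NOT the QTBF code: it is plain C++ with no
Qt, the names (ToyReason, toyReasonsAt, toyWords) are made up for this
illustration, and it assumes the attributes depend on nothing but the
characters directly before and after a position.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flags, loosely named after QTextBoundaryFinder's reasons.
enum ToyReason : unsigned { ToyNone = 0, ToyStartWord = 1, ToyEndWord = 2 };

// Positions past the end of the text count as whitespace in this sketch.
static bool toyIsSpace(const std::string &s, std::size_t i)
{
    return i >= s.size() || std::isspace(static_cast<unsigned char>(s[i])) != 0;
}

// Guess the flags for the position just before s[pos] (0 <= pos <= s.size()),
// looking only at whether the neighbouring characters are whitespace.
static unsigned toyReasonsAt(const std::string &s, std::size_t pos)
{
    const bool spaceBefore = (pos == 0) || toyIsSpace(s, pos - 1);
    const bool spaceAfter = toyIsSpace(s, pos);
    unsigned reasons = ToyNone;
    if (spaceBefore && !spaceAfter)
        reasons |= ToyStartWord;   // whitespace, then text
    if (!spaceBefore && spaceAfter)
        reasons |= ToyEndWord;     // text, then whitespace
    return reasons;
}

// Collect words with the same start/end bookkeeping as the quoted
// finderNextWord() loop, but without any isValidWord() filtering.
static std::vector<std::string> toyWords(const std::string &s)
{
    std::vector<std::string> words;
    std::size_t start = 0;
    bool inWord = false;
    for (std::size_t pos = 0; pos <= s.size(); ++pos) {
        const unsigned reasons = toyReasonsAt(s, pos);
        if ((reasons & ToyEndWord) && inWord) {
            words.push_back(s.substr(start, pos - start));
            inWord = false;
        }
        if (reasons & ToyStartWord) {
            start = pos;
            inWord = true;
        }
    }
    return words;
}
```

Fed the buffer from the unittest, toyWords() returns "This", "is", "a",
"sample", "buffer.Please", "test", "me" and ".": the position between
"buffer" and "." has text on both sides, so it gets neither flag, while the
lone "." at the end is surrounded by whitespace and comes out as a word.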
I have a patch that changes QTBF's behavior so that " . " won't be
treated as a word at all. That patch depends heavily on some other
patches that are still in review, though.
I would appreciate it if you could extend the QTBF autotests with some of
Sonnet's test cases.
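
As a starting point, the Qt4 output quoted above could be captured as a
data table along these lines. This is only a sketch in plain C++ (no
QtTest), and the names SonnetHit, kSonnetBuffer, sonnetExpectedHits and
sonnetHitsMatchBuffer are hypothetical. It merely checks that the table is
consistent with the buffer; a real autotest would compare it against what
QTBF reports.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One expected word: its start offset in the buffer and its text.
struct SonnetHit {
    std::size_t start;
    std::string word;
};

// The buffer used by the Sonnet unittest (five spaces between "is" and "a").
static const std::string kSonnetBuffer =
    "This is     a sample buffer.Please test me .";

// Offsets and words copied from the Qt4 debug output quoted above.
static std::vector<SonnetHit> sonnetExpectedHits()
{
    return {
        {0, "This"},    {5, "is"},      {12, "a"},    {14, "sample"},
        {21, "buffer"}, {28, "Please"}, {35, "test"}, {40, "me"},
    };
}

// Sanity check: every expected word really sits at its stated offset.
static bool sonnetHitsMatchBuffer()
{
    for (const SonnetHit &hit : sonnetExpectedHits()) {
        if (kSonnetBuffer.compare(hit.start, hit.word.size(), hit.word) != 0)
            return false;
    }
    return true;
}
```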

kind regards,
Konstantin


