[Interest] make qjsondocument recognize utf8 as utf8
Thiago Macieira
thiago.macieira at intel.com
Thu Dec 31 22:57:12 CET 2015
On Thursday 31 December 2015 19:29:11 Thiago Macieira wrote:
> > The offset: 494
> >
> > On Ubuntu 15.10, with the commercial 5.5.1, the output is Valid.
> >
> > > For that matter, in the file that produces the error, is it using CRLF
> > > line endings? The one from your email does.
> >
> > It doesn't matter: I ran dos2unix on it, but it gives the same result.
>
> With the same offset?
With CRLF line endings, offset 494 is nowhere near a multi-byte UTF-8 sequence
(it's the first 'y' in "Styczny normalny pędzel").
With LF line endings, offset 494 is one byte before that 'ę'. I've checked, and
QJsonDocument's parser does NOT have an off-by-one error when reporting the
offset of invalid UTF-8 sequences:
$ echo -e "[\"\x80\"]" | ./jsonvalidator /dev/stdin
Invalid "invalid UTF8 string" 2
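To check a file independently of Qt, a standalone scan for the first malformed UTF-8 byte can be sketched as below. This is only a minimal sketch using the standard library: the function name firstInvalidUtf8Offset is hypothetical, and it is not the code in qjsonparser.cpp or qutfcodec_p.h. It only checks UTF-8 well-formedness, not JSON syntax, so its offsets need not match QJsonDocument's in every case.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Qt): returns the byte offset of the first
// byte that begins a malformed UTF-8 sequence, or -1 if the buffer is valid.
long firstInvalidUtf8Offset(const std::string &data)
{
    const std::size_t n = data.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(data[i]);
        if (lead < 0x80) {              // plain ASCII byte
            ++i;
            continue;
        }

        std::size_t len = 0;            // total length of the sequence
        unsigned long cp = 0;           // code point being decoded
        if (lead >= 0xC2 && lead <= 0xDF) {
            len = 2; cp = lead & 0x1F;
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            len = 3; cp = lead & 0x0F;
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            len = 4; cp = lead & 0x07;
        } else {
            // Stray continuation byte (0x80..0xBF) or a lead byte that can
            // never appear in UTF-8 (0xC0, 0xC1, 0xF5..0xFF).
            return static_cast<long>(i);
        }

        if (i + len > n)                // sequence truncated by end of input
            return static_cast<long>(i);

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(data[i + k]);
            if ((c & 0xC0) != 0x80)     // not a continuation byte
                return static_cast<long>(i);
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong encodings, UTF-16 surrogates and values > U+10FFFF.
        if ((len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            return static_cast<long>(i);

        i += len;
    }
    return -1;
}
```

Reading the whole file into a std::string in binary mode and passing it to this function should return -1 if the file really is clean UTF-8, whatever the line endings. For the test input above, the 0x80 byte sits at offset 2.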
There are no changes to either qjsonparser.cpp or qutfcodec_p.h since v5.5.1
that could account for this bug being fixed.
The only two things that I can think of to explain this problem are that
either CentOS packages applied a patch that broke the UTF-8 parsing or that
their compiler is generating bad code.
Sorry.
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center