[Development] utf-8 BOM and parsers
Thiago Macieira
thiago.macieira at intel.com
Mon Apr 14 16:14:44 CEST 2014
On Monday, 14 April 2014, at 15:13:53, Frank Osterfeld wrote:
> On 14 Apr 2014, at 14:26, Simon Hausmann <simon.hausmann at digia.com> wrote:
> > Since this affects not just one place but many (and for example we have
> > many copies of the QML lexer around), I'd like to determine what the
> > _correct_ fix for this issue is, because frankly speaking I don't know
> > :). However I have an interest in the same fix being applied to qtbase,
> > qtdeclarative, qtscript, qtcreator and other affected modules.
>
> Even more critical, this behavioural change won’t only affect Qt modules,
> but also a lot of customer code, which cannot be fixed by us. Which makes
> me wonder if such a change between 5.2 and 5.3 is acceptable at all.
> Was it intentional or an unintended side-effect? I can’t find any
> discussion about the issue.
It was intentional as part of the UTF-8 codec rewrite.
> > 3) I noticed that QString::fromUtf8() differs from QTextCodec in this
> > aspect. Is that intentional?
>
> That inconsistency makes it even more confusing to me.
QTextCodec is stateful and allows you to choose, as one of the options,
whether to ignore the BOM or not. QString::fromUtf8 is stateless.
Anyway, I don't want to change the behaviour back, but if the consensus is
that it should be done, I'll prepare a patch and send it to the release branch.
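
For code that has to behave the same on both sides of this change, one option is to strip the BOM from the raw bytes before they reach any decoder. The following is only a minimal sketch under that assumption: skipUtf8Bom is a hypothetical helper, not a Qt API, and it uses nothing but the C++17 standard library. The only fact it relies on is that the UTF-8 encoding of U+FEFF is the byte sequence EF BB BF.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of Qt): returns a view of the input with a
// leading UTF-8 byte order mark (bytes EF BB BF) skipped, if one is present.
// Input that is shorter than three bytes, or that does not start with the
// BOM, is returned unchanged.
inline std::string_view skipUtf8Bom(std::string_view bytes)
{
    constexpr std::string_view bom("\xEF\xBB\xBF", 3);
    if (bytes.substr(0, bom.size()) == bom)
        bytes.remove_prefix(bom.size());
    return bytes;
}
```

A caller could then hand the result to the stateless decoder, for example QString::fromUtf8(view.data(), int(view.size())), so that a lexer sees the same text whether or not the decoder itself drops the BOM.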
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center