[Development] QTextCodec removal and QXmlStreamWriter
Kai Pastor, DG0YT
dg0yt at darc.de
Sun Nov 17 10:43:00 CET 2019
(From "RFC: Defaulting to or enforcing UTF-8 locales on Unix systems"...)
On 17.11.19 at 01:55, Thiago Macieira wrote:
> It all started with a change (see OP) about removing QTextCodec from the API
> and from QtCore. It seemed reasonable enough but it turned up quite a few
> kinks that hadn't been predicted. One of them, which may still be a
> showstopper, is QXmlStreamReader's inability to handle XML data encoded in
> anything except UTF-8, though a thorough search of all XML files in my system
> turned up exactly zero such files.
By default, QXmlStreamWriter outputs UTF-8. With QTextCodec removed,
will QXmlStreamWriter always output UTF-8? If so, will it be changed to
handle UTF-8 input as efficiently as possible?
At the moment, the public API accepts only QString. So unless you
already have a QString, you convert from UTF-8, Latin-1 or raw numerical
types to UTF-16 (QString), and then QXmlStreamWriter converts to UTF-8
for output. The double conversion costs a lot of CPU time and extra
memory allocations for what I consider a typical use case. As an
example, think of an SVG document where graphical "paths" are very long
sequences of letters and numbers which are known to be Latin-1 and not
to need any escaping. The effect can be measured by sending the
characters directly to the device instead of going through
QXmlStreamWriter::writeCharacters().
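To make the double conversion concrete, here is a minimal sketch in
plain standard C++ (no Qt). The function names latin1ToUtf16,
utf16ToUtf8 and latin1ToUtf8 are hypothetical helpers made up for this
illustration; they are not Qt API and do not reflect how Qt implements
its conversions. The first two model the Latin-1 -> UTF-16 -> UTF-8
route, the third goes from Latin-1 to UTF-8 in a single pass with one
output buffer.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: widen Latin-1 bytes to UTF-16 code units.
// This models the "build a QString first" step (first allocation).
std::u16string latin1ToUtf16(std::string_view in) {
    std::u16string out;
    out.reserve(in.size());
    for (unsigned char c : in)
        out.push_back(static_cast<char16_t>(c));
    return out;
}

// Hypothetical helper: encode UTF-16 as UTF-8 (second allocation).
// Surrogate pairs are combined; lone surrogates are not validated.
std::string utf16ToUtf8(std::u16string_view in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Hypothetical helper: Latin-1 straight to UTF-8, no UTF-16 detour.
// Bytes below 0x80 are copied; the rest become two-byte sequences.
std::string latin1ToUtf8(std::string_view in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

Both routes yield the same bytes, but the direct one skips the
intermediate UTF-16 buffer. For pure ASCII input such as SVG path data,
it degenerates to a plain copy.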
Latin-1 element names and attribute names are quite common, too, so they
might also be candidates for skipping the UTF-16 (QString) conversion step.
Kai