[Development] Qt::CaseInsensitive comparison is not the same as toLower() comparison
Thiago Macieira
thiago.macieira at intel.com
Wed Feb 10 22:36:01 CET 2016
On Wednesday, 10 February 2016 19:46:49 PST Knoll Lars wrote:
> They should only compare true with full case folding rules. This is
> something we have so far not implemented in Qt; as you noted below we're
> still using simple case folding rules.
>
> This is actually somewhat similar to the case of comparing strings in
> different normalisation forms (composed vs decomposed), something we also
> don't do out of the box (i.e. 'Ä' doesn't compare true with the combination
> of 'A' with the diacritical mark for umlaut).
>
> At some point it would probably be nice to implement support for comparing
> these correctly. This does however have a performance impact and many use
> cases might not want this comparison by default.
I would say this requires more items in Qt::CaseSensitivity (or a new enum).
CaseSensitive => no case folding, no normalisation
CaseSensitiveNormalized => no case folding, but normalised
CaseInsensitive => case-folded, no normalisation
CaseInsensitiveNormalized => both
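
To make the four modes concrete, here is a rough sketch in plain C++ (no Qt). Everything in it is hypothetical: the CompareMode enum, the helper names, and above all the two "toy" tables, which only know about ASCII, Ä/ä and the combining diaeresis U+0308. A real implementation would use the Unicode case folding and normalisation data, and would have to think harder about the order of the two steps.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum mirroring the four modes listed above.
enum class CompareMode {
    CaseSensitive,             // no case folding, no normalisation
    CaseSensitiveNormalized,   // no case folding, but normalised
    CaseInsensitive,           // case-folded, no normalisation
    CaseInsensitiveNormalized  // both
};

// Toy simple case folding: ASCII plus U+00C4 (Ä) -> U+00E4 (ä) only.
char32_t foldSimpleToy(char32_t c)
{
    if (c >= U'A' && c <= U'Z')
        return static_cast<char32_t>(c + (U'a' - U'A'));
    if (c == U'\u00C4')
        return U'\u00E4';
    return c;
}

// Toy composition: only A/a followed by U+0308 (combining diaeresis).
std::u32string composeToy(const std::u32string &s)
{
    std::u32string out;
    for (char32_t c : s) {
        if (c == U'\u0308' && !out.empty()) {
            if (out.back() == U'A') { out.back() = U'\u00C4'; continue; }
            if (out.back() == U'a') { out.back() = U'\u00E4'; continue; }
        }
        out.push_back(c);
    }
    return out;
}

// Equality under the requested mode: normalise first, then fold.
bool equalsToy(std::u32string a, std::u32string b, CompareMode mode)
{
    const bool normalize = mode == CompareMode::CaseSensitiveNormalized
                        || mode == CompareMode::CaseInsensitiveNormalized;
    const bool fold = mode == CompareMode::CaseInsensitive
                   || mode == CompareMode::CaseInsensitiveNormalized;
    if (normalize) {
        a = composeToy(a);
        b = composeToy(b);
    }
    if (fold) {
        for (char32_t &c : a) c = foldSimpleToy(c);
        for (char32_t &c : b) c = foldSimpleToy(c);
    }
    return a == b;
}

// Usage: lowercase composed "ä" against decomposed "A" + U+0308.
void exampleModes()
{
    const std::u32string composed = U"\u00E4";
    const std::u32string decomposed = U"A\u0308";
    assert(!equalsToy(composed, decomposed, CompareMode::CaseSensitive));
    assert(!equalsToy(composed, decomposed, CompareMode::CaseSensitiveNormalized));
    assert(!equalsToy(composed, decomposed, CompareMode::CaseInsensitive));
    assert(equalsToy(composed, decomposed, CompareMode::CaseInsensitiveNormalized));
}
```

In this toy, only the last mode treats the two spellings as equal, which is the behaviour the "both" entry is meant to describe.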
We may need a comparison that uses lowercasing and/or uppercasing instead of
case-folding for contexts where that is important. I'm thinking specifically of
security issues in networking, due to how certain protocols may interpret
certain characters.
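
A small illustration of why the choice of mapping matters, again as a sketch in plain C++ with hand-written toy tables (the function names are made up, and only ASCII plus three code points are covered). The code points are U+00B5 MICRO SIGN, U+03BC GREEK SMALL LETTER MU and U+039C GREEK CAPITAL LETTER MU: the micro sign is its own lowercase, but it uppercases to U+039C and case-folds to U+03BC.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy lowercase mapping: ASCII, plus U+039C -> U+03BC.
// U+00B5 and U+03BC are already lowercase and map to themselves.
char32_t toLowerToy(char32_t c)
{
    if (c >= U'A' && c <= U'Z') return static_cast<char32_t>(c + 32);
    if (c == U'\u039C') return U'\u03BC';
    return c;
}

// Toy uppercase mapping: ASCII, plus both U+00B5 and U+03BC -> U+039C.
char32_t toUpperToy(char32_t c)
{
    if (c >= U'a' && c <= U'z') return static_cast<char32_t>(c - 32);
    if (c == U'\u00B5' || c == U'\u03BC') return U'\u039C';
    return c;
}

// Toy simple case folding: like lowercasing, except U+00B5 -> U+03BC.
char32_t caseFoldToy(char32_t c)
{
    if (c == U'\u00B5') return U'\u03BC';
    return toLowerToy(c);
}

// Compare two strings code point by code point after applying `map`.
template <typename Map>
bool equalUnder(const std::u32string &a, const std::u32string &b, Map map)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (map(a[i]) != map(b[i])) return false;
    return true;
}

// Usage: the same pair of names is equal or not depending on the mapping.
void exampleMappings()
{
    const std::u32string micro = U"\u00B5.TXT";  // micro sign
    const std::u32string mu = U"\u03BC.txt";     // Greek small letter mu
    assert(!equalUnder(micro, mu, toLowerToy));  // lowercasing keeps them apart
    assert(equalUnder(micro, mu, toUpperToy));   // uppercasing merges them
    assert(equalUnder(micro, mu, caseFoldToy));  // so does case folding
}
```

So a protocol that specifies "compare after lowercasing" and a library that compares after case folding can disagree about whether two names are the same.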
One particular case that comes to mind is case-insensitive filesystems. What
kind of "insensitivity" do they use? Even if we ignore the Turkic I and ligatures
like ß and ff, we still have "simple" cases like the Greek letter mu (U+03BC)
and the micro sign (U+00B5): on both Windows and OS X, I can create both files.
$ touch $'\u03bc'.txt $'\u00b5'.txt
$ ls ?.txt
µ.txt μ.txt
This leads to security vulnerabilities like:
    QString filename = QString::fromUtf8(socket.readAll());
    if (filename.compare("µ.txt", Qt::CaseInsensitive) == 0) {
        QFile f(filename);
        if (f.open(QIODevice::ReadOnly)) {
            socket.write(f.readAll());
            return true;
        }
    }
    return false;
[Why would you compare µ case insensitively? Because it wasn't the µ I was
concerned about, but the ".txt" part!]
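
One possible way to express that intent, sketched in plain C++ on UTF-8 byte strings rather than with Qt: compare the stem byte for byte and only the extension case-insensitively, restricted to ASCII. The helper names are hypothetical, and this deliberately ignores normalisation and whatever matching the filesystem itself performs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ASCII-only case-insensitive equality; bytes >= 0x80 must match exactly.
bool asciiCaseEqual(const std::string &a, const std::string &b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        unsigned char x = static_cast<unsigned char>(a[i]);
        unsigned char y = static_cast<unsigned char>(b[i]);
        if (x >= 'A' && x <= 'Z') x += 32;
        if (y >= 'A' && y <= 'Z') y += 32;
        if (x != y) return false;
    }
    return true;
}

// Hypothetical helper: accept `name` only if the part before the last '.'
// equals `stem` byte for byte and the part after it equals `ext`,
// ignoring ASCII case.
bool matchesAllowedFile(const std::string &name,
                        const std::string &stem,
                        const std::string &ext)
{
    const std::size_t dot = name.rfind('.');
    if (dot == std::string::npos) return false;
    return name.compare(0, dot, stem) == 0
        && asciiCaseEqual(name.substr(dot + 1), ext);
}

// Usage: "\xC2\xB5" is U+00B5 (micro sign), "\xCE\xBC" is U+03BC (mu).
void exampleAllowedFile()
{
    assert(matchesAllowedFile("\xC2\xB5.txt", "\xC2\xB5", "txt"));
    assert(matchesAllowedFile("\xC2\xB5.TXT", "\xC2\xB5", "txt"));
    assert(!matchesAllowedFile("\xCE\xBC.txt", "\xC2\xB5", "txt"));
}
```

Opening the name you compared against, rather than the string received from the socket, would probably be the more robust fix wherever that is possible.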
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center