[Development] Deprecating QFile::encodeName/decodeName
Thiago Macieira
thiago.macieira at intel.com
Wed Jun 6 17:36:30 CEST 2012
On Wednesday, 6 June 2012 16:51:14, João Abecasis wrote:
> > From there, we come to the conclusion that the QString representing such a
> > file name must contain special processing instructions (e.g., one or
> > more special characters). One form of special processing instruction is
> > escaping each character, like URLs do. The problem with the approach of
> > escaping is what to do when the escape character occurs in a file name.
> > If that is a possibility, the escape character needs to be escaped by
> > itself (like "\\" for backslashes in C or "%25" for percents in URLs). If
> > we use this approach, then we will not interoperate properly with non-Qt
> > applications when this character occurs.
> >
> >
> > The only sane solution, then, is to use a character that has a very small
> > chance of ever being used or, better yet, a zero chance (I don't think
> > there's any). If that happens, then this character will be close to
> > "untypeable" on the terminal. Not a big loss, I'd say.
>
> We could use some magic sequence. Windows, for instance, uses the "\\?\"
> prefix to support longer paths. We could use '<' and '>', which are rare
> but valid, or we could give a specific meaning to sequences of 3 or more
> slashes.
>
> I don't have a concrete solution at the moment.
I really think we should not use a character that commonly appears in file
names, and that includes <, >, commas, percents, backslashes, spaces, etc. It
needs to be a Unicode character that has a close-to-zero chance of being
intentionally used.
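To make the objection to in-band escape characters concrete, here is a minimal sketch using URL-style percent-escaping from Python's standard library. The file names are illustrative only, and nothing here is Qt API; it only shows that once '%' is the escape character, a file whose real name contains '%' gets a string form that differs from what other applications display.

```python
from urllib.parse import quote, unquote_to_bytes

# Raw 8-bit file name: "Résumé.txt" stored as ISO-8859-1 bytes (illustrative).
raw = b"R\xe9sum\xe9.txt"

# Percent-escaping turns every non-ASCII byte into %XX and is reversible.
escaped = quote(raw)
restored = unquote_to_bytes(escaped)

# The escape character itself must be escaped too, so a file that is
# really called "100%.txt" is no longer represented by that string.
literal_percent = quote(b"100%.txt")

print(escaped)
print(literal_percent)
```

A non-Qt application would show the second file as "100%.txt", while the escaped form reads "100%25.txt", which is the interoperability problem described above.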
I recommend selecting one or two characters from the Unicode private use area
for this. We could use a non-character (such as U+FDD0), but that will cause
problems elsewhere. For example, if you add such a path to QTextBrowser, it
might do weird things. For another, such characters are dropped by the UTF-8
encoder and decoder, aren't allowed in D-Bus, etc.
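As a small illustration of the difference between the two kinds of code points, the sketch below asks Python's Unicode database for their general categories. U+E000 is only an illustrative pick from the private use area, not a proposal; U+FDD0 is the non-character mentioned above.

```python
import unicodedata

PRIVATE_USE = "\ue000"   # first code point of the BMP private use area (illustrative pick)
NONCHARACTER = "\ufdd0"  # the non-character named in the message

def describe(ch: str) -> str:
    """Return the code point and its Unicode general category."""
    return f"U+{ord(ch):04X} category={unicodedata.category(ch)}"

print(describe(PRIVATE_USE))
print(describe(NONCHARACTER))
```

Private use code points carry the category "Co" and are valid for interchange by private agreement, whereas non-characters are reported as unassigned ("Cn"), which is one reason other components may reject or drop them.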
This character will be all but "untypeable" on the command line. I don't think
we care, though, since Qt applications are seldom launched from the command
line and, besides, if the user sees the broken file name anyway (in either
form), the user is likely to fix the problem.
> > If it was named "βιογραφικό σημείωμα.txt" in ISO-8859-7, the QString
> > representation would be:
> > /home/foo/έγγραφα/<escape>âéïãñáöéêü óçìåßùìá.txt
> >
> > That has the drawback of being hard to use when it comes to path
> > manipulation. Appending, prepending, extracting or inserting text could
> > have unexpected consequences.
>
> I think any such scheme should support both absolute and relative paths and
> should allow a relative path to be combined with an absolute path with:
>
> absolute-path + '/' + relative-path
If you append a slash, it unshifts back to normal. But imagine someone
appending a suffix. Thankfully, non-ASCII suffixes / extensions are really rare.
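A minimal sketch of such a per-component shift follows, in Python and under stated assumptions: the marker is a hypothetical private-use character (U+E000), valid components are UTF-8, and an undecodable component is shown by mapping each byte to the code point of the same value. The names decode_name and encode_name only echo the QFile functions; this is not how Qt implements them.

```python
ESCAPE = "\ue000"  # hypothetical marker taken from the private use area

def decode_name(raw: bytes) -> str:
    """Decode each path component as UTF-8; mark undecodable ones with ESCAPE."""
    parts = []
    for comp in raw.split(b"/"):
        try:
            parts.append(comp.decode("utf-8"))
        except UnicodeDecodeError:
            # Shifted component: one character per byte, same numeric value.
            parts.append(ESCAPE + comp.decode("latin-1"))
    return "/".join(parts)

def encode_name(name: str) -> bytes:
    """Reverse decode_name; the shift lasts until the next slash."""
    parts = []
    for comp in name.split("/"):
        if comp.startswith(ESCAPE):
            parts.append(comp[len(ESCAPE):].encode("latin-1"))
        else:
            parts.append(comp.encode("utf-8"))
    return b"/".join(parts)

# The example from the quoted text: UTF-8 directory, ISO-8859-7 file name.
raw = "/home/foo/έγγραφα/".encode("utf-8") + "βιογραφικό σημείωμα.txt".encode("iso-8859-7")
shown = decode_name(raw)

# Appending '/' + relative path starts a new, unshifted component.
legacy_dir = b"/home/foo/" + "έγγραφα".encode("iso-8859-7")
combined = decode_name(legacy_dir) + "/" + "σημείωμα.txt"
```

Appending an ASCII suffix such as ".bak" to the shifted file name still round-trips in this sketch, but appending a suffix with characters above U+00FF makes encode_name fail, which is the suffix problem mentioned above.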
> > Limitations:
> > a) Qt-only, I don't expect anyone else to use such file names
> > b) if encodeName() isn't used properly, it leads to a bad encoding of the
> > file name onto 8-bit. Applications dealing with the filesystem need to
> > be extra careful so as to not show two representations of the same file.
> > c) for that matter, it's possible to produce an escaped form that matches
> > a regular file name
> > d) double representations are often a source of security issues if not
> > handled carefully (cf. overlong sequences in UTF-8)
>
> I don't see a) as such a big problem, since currently Qt can't even handle
> such file names. As for b) I think ideally we'd come up with something that
> makes the use of encode/decodeName invisible and doesn't require users to
> register their own encoding/decoding functions. c) is what we want to
> minimize.
>
> As for d), if we make it all transparent and handled in a seamless way in Qt
> the problem that remains is how those paths interoperate with other
> applications and user code. It really helps to minimize c).
I'm not sure I agree with your dismissal of d). I'd like to see more research
into this topic first.
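To show why the double representation matters, here is a small Python sketch under the same kind of assumptions as the quoted scheme: a hypothetical private-use marker (U+E000), UTF-8 for valid names, and a byte-per-character shift for invalid ones. The helper names are made up for illustration.

```python
ESCAPE = "\ue000"  # hypothetical private-use marker

def decode_component(raw: bytes) -> str:
    """UTF-8 if possible, otherwise ESCAPE plus one character per byte."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return ESCAPE + raw.decode("latin-1")

def encode_component(name: str) -> bytes:
    """Reverse of decode_component for a single path component."""
    if name.startswith(ESCAPE):
        return name[len(ESCAPE):].encode("latin-1")
    return name.encode("utf-8")

# A legacy 8-bit name that is not valid UTF-8, so it gets shifted.
legacy = b"R\xe9sum\xe9.txt"
# A different file whose valid UTF-8 name happens to start with the marker.
lookalike = (ESCAPE + "Résumé.txt").encode("utf-8")

same_string = decode_component(legacy) == decode_component(lookalike)
round_trips = encode_component(decode_component(lookalike)) == lookalike
```

In this sketch two distinct on-disk names map to the same string, and the second one does not survive a round trip, so any check performed on the string form may end up applying to a different file than the one later opened.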
> On the other hand we already have Qt-only paths in resource files and
> QDir::searchPaths(). We could easily use a well-known prefix for the
> special paths: url-encoded:/usr/joao/R%E9sum%E9.txt, which only supports
> absolute paths, but would already enable all items in my wish list.
> > As you can see, I didn't come up with this today. I've known these
> > alternatives for years. I don't think they're worth our time.
Search paths and the filesystem engines are misfeatures. One is gone, the other
not yet. They are potential security issues too.
Anyway, what I recommend for now:
1) immediately, de-inline QFile::decodeName and QFile::encodeName
2) un-deprecate them and update the text in changes-5.0.0
3) make QProcess use QFile::encodeName for its arguments (no-op right now)
4) make QCoreApplication parse its arguments using QFile::decodeName (no-op
right now)
5) do the same for Laszlo's command-line parser class
Later, we can decide whether to add escaping to those functions.
However, I cannot agree with bringing the setter functions back. I do agree
with removing them completely, though.
--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel Open Source Technology Center
Intel Sweden AB - Registration Number: 556189-6027
Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden