[Development] The life of a file name and other possibly mal-encoded strings on non-Windows systems
kuba at mareimbrium.org
Thu Oct 9 08:46:49 CEST 2014
On Tue, Oct 7, 2014 at 11:41 AM, Thiago Macieira <thiago.macieira at intel.com> wrote:
> On Tuesday 07 October 2014 10:38:47 Kuba Ober wrote:
> > Just to be very clear: it is currently impossible to make a truly
> > file management utility with Qt’s core APIs. Why? Because it will simply
> > ignore all file names that it can’t decode when iterating the directory,
> > and it won’t be able to take commandline arguments to open such files
> > either. Furthermore, this is something that very basic C code using
> > nothing but POSIX APIs can trivially deal with. Or that Python 2 trivially deals
> > with. I consider it a serious enough problem.
> That's where we disagree: those file names are not common at all.
I have a server whose filesystems were established at the time of
RHEL 2 and the first Samba releases, with Windows NT 4 clients. There were
thousands of such files; I eventually grew tired of them being invisible to
Qt and fixed them all. Interestingly enough, on Windows machines Qt
would see the files, because Samba was pretending really hard that the
names were representable in UTF-16.
The problem manifests itself almost any time you plug in a small USB memory
stick that has localized file names on FAT-16 into any Unix system, from
the most modern OS X to legacy stuff that seems to have just learned that
USB storage exists. Sure, you could argue that the distribution should be
set up to ask the user for filename encoding on such a medium, or to
transparently do a best-guess transcoding to UTF-8, or whatnot. But the
reality is that none of this happens, and the files become invisible. The
worst part of the problem is that usually not all files with non-ASCII
characters vanish; only those whose names are not valid UTF-8 code unit
sequences do. This behavior utterly confuses users.
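
To make the last point concrete, here is a minimal Python 3 sketch (not Qt
code) of how a tool can read file names as raw bytes and see which of them a
strict UTF-8 decoder would reject. The function names and the sample file
names are made up for illustration, and the sample assumes a medium written
with Latin-1 names; the real encoding on a given stick may well differ.

```python
import os


def list_raw_names(directory):
    """Return directory entries as undecoded byte strings (hypothetical helper)."""
    # Passing a bytes path makes os.listdir return bytes names, so no entry
    # has to be decodable in order to be listed.
    return os.listdir(os.fsencode(directory))


def split_by_utf8_validity(raw_names):
    """Partition byte-string names into (valid UTF-8, not valid UTF-8)."""
    decodable, undecodable = [], []
    for raw in raw_names:
        try:
            raw.decode("utf-8")  # strict decoding rejects invalid sequences
        except UnicodeDecodeError:
            undecodable.append(raw)
        else:
            decodable.append(raw)
    return decodable, undecodable


# Illustrative names only; these are assumptions, not taken from a real medium.
sample = [
    b"readme.txt",                     # pure ASCII, always valid UTF-8
    "caf\u00e9.txt".encode("utf-8"),   # non-ASCII, stored as valid UTF-8
    "caf\u00e9.txt".encode("latin-1"), # same name in Latin-1: b"caf\xe9.txt"
]
visible, hidden = split_by_utf8_validity(sample)
```

With this sample, the two names that end up in `hidden` and `visible` look
identical to the user once displayed, which is exactly why it is so confusing
when a decoder-based directory listing shows one and silently drops the other.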