[Development] Deprecating QFile::encodeName/decodeName

Thiago Macieira thiago.macieira at intel.com
Fri Jun 8 12:18:14 CEST 2012


On Thursday, 7 June 2012 14:58:03, Oswald Buddenhagen wrote:
> > Imagine a Qt application run from the command-line with:
> > 	qtapp *
> > 
> > In that directory there is a file name with broken encoding. The shell
> > will not recode (which is why I don't buy the command-line encoding
> > argument).
> yeah, too bad.

Don't be so quick to dismiss it.

> > The Qt application should be expected to work and interpret that
> > argument properly.
> 
> and how? have you read the part about the encoding being mount point
> specific?

I have. This applies only to filesystems that store filenames in Unicode, like
VFAT, NTFS and ISO9660+Joliet (and possibly UDF). For those, the user is expected
to mount the filesystem with the proper option so that the filenames are 
rendered into the locale's encoding. If the proper setup was done, those 
filenames will not be a problem. If it was done improperly, then we fall into 
the next case.

The problem only arises from filesystems that don't store Unicode filenames, but 
plain 8-bit C strings, like all the Unix filesystems. For those, there's no 
concept of locale. Filenames are simply arbitrary data and can contain any 
byte except two: NUL (0x00) and slash (0x2F).
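
To make the "any byte but two" rule concrete, here is a minimal sketch in
plain C++ (no Qt). The function name isValidUnixNameComponent is hypothetical,
and the check deliberately ignores length limits and the special names "." and
"..". It only shows that no encoding is involved in the decision.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns true if `name` could be stored as a single
// path component on a typical Unix filesystem. Every byte value is accepted
// except NUL (the C string terminator) and '/' (the path separator).
// Empty names are rejected as well.
bool isValidUnixNameComponent(const std::string &name)
{
    if (name.empty())
        return false;
    for (unsigned char c : name) {
        if (c == '\0' || c == '/')
            return false;
    }
    return true;
}
```

Note that a name such as "caf\xE9.txt" (a Latin-1 byte that is not valid UTF-8)
passes this check: the filesystem stores it without complaint, whatever the
locale says.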

> how does the application know whether the caller did 8-bit
> pass-through or actually did the right thing and recoded to the locale
> encoding (which would be the case for example when you paste a correctly
> decoded filename from a gui to the command line)?

Like this: the application assumes that 8-bit input contains exactly the data
obtained from the OS's 8-bit API. That's what most applications do today, and
it's what the shell does when it expands *. For the same reason, an
application *should* produce the same 8-bit form in its output, not another one.
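
One way to picture this rule is a sketch that keeps the received bytes
untouched for all I/O and output, and derives a separate, lossy string only
for showing to a human. This is plain C++ and not Qt's API; FileArgument and
makeFileArgument are hypothetical names, and the display conversion is
deliberately crude (anything outside printable ASCII becomes '?').

```cpp
#include <cassert>
#include <string>

// Hypothetical container for one file name taken from argv or readdir().
struct FileArgument {
    std::string raw;      // exact bytes received; used for open() and output
    std::string display;  // lossy form, for human eyes only
};

// Builds a FileArgument without ever modifying the raw bytes.
FileArgument makeFileArgument(const std::string &rawBytes)
{
    FileArgument arg;
    arg.raw = rawBytes;
    for (unsigned char c : rawBytes) {
        // Printable ASCII is copied; every other byte is shown as '?'.
        arg.display += (c >= 0x20 && c < 0x7F) ? static_cast<char>(c) : '?';
    }
    return arg;
}
```

Whatever the program writes back out (to a pipe, to another process, to a
system call) would come from raw, so the next program in the chain sees the
same 8-bit form the shell produced.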

There's no such thing as a "correctly decoded filename from a GUI". The GUI is 
still using the OS's 8-bit API and is subject to the same decoding problems as 
the command-line application. For that reason, both applications need to make 
the same encoding decisions.

> this is simply a no-win situation, and by trying to work around it you
> make it only worse by introducing unpredictability into the game.

Agreed. That's why I've given up on solving this problem completely.

My solution:

File names that cannot be decoded in the locale's encoding are filesystem
corruption. Qt applications do not need to handle them. Leave that for system
administration applications.
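
As an illustration of what "outside the locale" means in practice, the sketch
below assumes the locale's encoding is UTF-8 and checks whether a file name is
well-formed in it. That assumption is mine, not part of the proposal: a real
application would ask the locale's codec instead. The name isValidUtf8 is
hypothetical and the code is plain C++, not Qt.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns true if `bytes` is well-formed UTF-8.
// Rejects stray or missing continuation bytes, overlong forms,
// surrogate code points and values above U+10FFFF.
bool isValidUtf8(const std::string &bytes)
{
    static const unsigned long minForLength[] = {0x0, 0x80, 0x800, 0x10000};
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra;  // number of continuation bytes expected
        unsigned long cp;   // code point being assembled
        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else return false;  // stray continuation byte or invalid lead byte

        if (n - i - 1 < extra)
            return false;   // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < minForLength[extra]) return false;       // overlong form
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        if (cp > 0x10FFFF) return false;                  // out of range
        i += extra + 1;
    }
    return true;
}
```

Under this sketch, a name that fails the check is what the policy above
treats as corruption: an ordinary application may refuse it or skip it rather
than try to guess its encoding.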

> the lesser evil is imo assuming correct locale encoding when actually
> interpreting external input, being consistent within the qt realm when
> dealing with i/o functions, and having functions for 8-bit pass-through
> when dealing with external things which are just passed along
> (qprocessenvironment already has this; it should be possible to do the
> same for cmdline args by having laszlo's work integrate with qprocess as
> well).

I agree on that too. Which is why I am telling João that the idea that 
filesystem encoding != locale encoding is insane. It simply cannot be 
implemented properly.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
     Intel Sweden AB - Registration Number: 556189-6027
     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden