[Qt-creator] "Editing not possible" solution

Frédéric Marchal frederic.marchal at wowtechnology.com
Sun Jan 25 14:24:20 CET 2015


On Friday 23 January 2015 15:27:05, Ziller Eike wrote :
> > On Jan 21, 2015, at 4:07 PM, Frédéric Marchal
> > <frederic.marchal at wowtechnology.com> wrote:
> > Note that simply running 'file *.cpp' on a project directory under
> > Linux does report files encoded in UTF-8 and iso-8859-1. Maybe its
> > algorithm might be the intermediate solution?
> 
> Actually when I run ‘file’ on a text file that contains Chinese characters
> in GB2312 (Simplified Chinese), then it reports /tmp/ch.txt: ISO-8859 text
> 
> That is not very helpful either ;)

Indeed :-)
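
A rough Python sketch of a range-based check shows how such a verdict can come
about. It is an assumed simplification, not the actual code of 'file': pure
ASCII first, then valid UTF-8, then anything whose high bytes stay out of
0x80-0x9F is called ISO-8859. The name guess_by_byte_ranges is made up for
this example.

```python
def guess_by_byte_ranges(data: bytes) -> str:
    """Classify bytes by value ranges only (assumed heuristic, not file's code)."""
    if all(b < 0x80 for b in data):
        return "ASCII"
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        pass
    # ISO-8859-x keeps printable characters in 0xA0-0xFF; 0x80-0x9F are controls.
    if all(b < 0x80 or b >= 0xA0 for b in data):
        return "ISO-8859"
    return "unknown"


chinese = "中文".encode("gb2312")      # two bytes per character, each in 0xA1-0xFE
latin = "Ötzi".encode("iso-8859-1")    # one high byte, 0xD6

print(guess_by_byte_ranges(chinese))
print(guess_by_byte_ranges(latin))
```

Both inputs get the same answer, because GB2312 bytes fall entirely inside the
range that ISO-8859 uses for printable characters.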


> - Anything can be opened with ISO Latin 1 without decoding errors (just
> that the result is “gibberish”) - Even the other way round, e.g. “©Ötzi”
> (which doesn’t successfully decode with UTF-8, so our warning pops up)
> successfully decodes in GB2312 Simplified Chinese (just that the result is
> (probably) “gibberish”)
> 
> So, just trying any combination of text codecs to find one that succeeds
> will most probably result in the wrong encoding. On the other hand I do
> not want 10000 lines of code for a fancy guessing-algorithm in Qt Creator,
> where even success is doubtful, since we cannot assume that sensible
> code does not contain what that algorithm considers “gibberish”.

That's my opinion too. I can't imagine any practical, fully automatic
algorithm that determines the correct encoding in most cases.
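
The “©Ötzi” case quoted above can be reproduced with a small Python sketch.
It assumes the text was saved as ISO Latin 1 and uses Python's built-in
codecs, which need not behave exactly like the text codecs in Qt. The helper
name codecs_that_decode is made up for the example.

```python
raw = "©Ötzi".encode("iso-8859-1")   # bytes as written by a Latin 1 editor


def codecs_that_decode(data: bytes, candidates):
    """Return the candidate codecs that decode data without an error."""
    accepted = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        accepted.append(name)
    return accepted


accepted = codecs_that_decode(raw, ["utf-8", "iso-8859-1", "gb2312", "cp1252"])
print(accepted)
```

Only UTF-8 rejects these bytes. Every other candidate "succeeds", so taking
the first codec that decodes without error says nothing about which one is
right.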

I have source files where the only non-ASCII characters are found in a couple
of literal strings like "%1°C" or "%1µm" or "%1€". With so few bytes to work
from, there is no way statistics can help. And as all messages and comments
are in English, it is not possible to guess the language based on the
vocabulary.
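
As a small illustration in Python, take a "%1€" literal and assume it was
saved as ISO-8859-15 (that choice is only an assumption for the example). The
whole string then contains exactly one non-ASCII byte:

```python
data = "%1€".encode("iso-8859-15")   # the euro sign becomes the single byte 0xA4

# Decode the same three bytes with several single-byte codecs.
readings = {
    codec: data.decode(codec)
    for codec in ("iso-8859-15", "iso-8859-1", "cp1252")
}

for codec, text in readings.items():
    print(codec, repr(text))
```

All three codecs decode the bytes without complaint. Only ISO-8859-15 gives
the euro sign back; the other two show the generic currency sign "¤". One
byte is not enough for any statistics to tell them apart.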


> What I can imagine to reduce the pain in Qt Creator, is to let it remember
> the used encoding for a file (if it is different from the default) and use
> it when reopening the file.

If it is possible to store that information and override it in case the file 
encoding is changed at some point, then it would greatly help.
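
To make the idea concrete, here is a minimal Python sketch of such a store.
It is purely hypothetical: the class name EncodingMemory, the JSON file and
its layout are assumptions for the example, not how Qt Creator stores its
settings.

```python
import json
import tempfile
from pathlib import Path


class EncodingMemory:
    """Hypothetical per-file encoding store (not Qt Creator's mechanism)."""

    def __init__(self, store: Path, default: str = "utf-8"):
        self.store = store
        self.default = default
        self.table = {}
        if store.exists():
            self.table = json.loads(store.read_text(encoding="utf-8"))

    def lookup(self, path: str) -> str:
        """Encoding to use when opening path."""
        return self.table.get(path, self.default)

    def remember(self, path: str, encoding: str) -> None:
        """Record or override the encoding; only deviations from the default are kept."""
        if encoding == self.default:
            self.table.pop(path, None)
        else:
            self.table[path] = encoding
        self.store.write_text(json.dumps(self.table, indent=2), encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "encodings.json"
    EncodingMemory(store).remember("src/units.cpp", "iso-8859-1")
    # A fresh instance, as after restarting the IDE, reads the stored value back.
    print(EncodingMemory(store).lookup("src/units.cpp"))
    print(EncodingMemory(store).lookup("src/main.cpp"))
```

Calling remember() again with another encoding overrides the old entry, which
covers the case where the file is converted later.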

Does the file encoding have to be stored in a way that's compatible with 
Version Control Systems? Does anybody know of a VCS that changes the encoding 
when checking out files?

If it is safe, the encoding might be saved à la Vim, in a comment at the end
of the file. It would spare the reader from guessing the encoding of a file
retrieved from some random project on the web.
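
A minimal Python sketch of reading such a trailing comment follows. The
"coding: name" marker syntax and the function name declared_encoding are
assumptions for the example, not an existing Qt Creator feature. The marker
is searched in the raw bytes, so this only works for ASCII-compatible
encodings.

```python
import re

# Assumed marker syntax: "coding: <name>" or "coding=<name>" inside a comment.
MARKER = re.compile(rb"coding[:=][ \t]*([-\w.]+)")


def declared_encoding(data: bytes, tail_lines: int = 5):
    """Return the encoding named in the last few lines, or None if there is none."""
    for line in reversed(data.splitlines()[-tail_lines:]):
        match = MARKER.search(line)
        if match:
            return match.group(1).decode("ascii")
    return None


source = 'QString s = "%1\xb0C";\n// coding: iso-8859-1\n'.encode("iso-8859-1")
encoding = declared_encoding(source)
print(encoding)
print(source.decode(encoding))
```

The file has to be read as bytes first and decoded only once the marker has
been found, which is why the pattern is a bytes pattern.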

 
> Maybe also a “fallback” encoding setting that is put into a quick access
> button directly into the “cannot open with encoding” info bar (so there
> would be a button “select encoding” and “use XYZ”), for people that
> regularly have to handle one additional “funny” encoding. (default: ISO
> Latin 1 ?) Would that be considered helpful?

Sorting the list of possible encodings with the most used at the top would 
already help a lot.
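
In Python, such an ordering could be sketched like this. The usage counts and
the function name order_encodings are invented for the example; where the
counts would really come from is left open.

```python
from collections import Counter


def order_encodings(all_encodings, usage: Counter):
    """Most frequently chosen encodings first, the rest in alphabetical order."""
    return sorted(all_encodings, key=lambda name: (-usage[name], name))


# Invented counts of how often each encoding was picked by the user.
usage = Counter({"ISO-8859-1": 12, "UTF-8": 40, "GB2312": 1})
available = ["Big5", "GB2312", "ISO-8859-1", "ISO-8859-15", "UTF-16", "UTF-8"]

print(order_encodings(available, usage))
# ['UTF-8', 'ISO-8859-1', 'GB2312', 'Big5', 'ISO-8859-15', 'UTF-16']
```

Encodings that were never chosen count as zero and keep their alphabetical
order below the used ones.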

Frederic


