[haiku] Re: Migration from Basho to Official Localization of Haiku

  • From: "François Revol" <revol@xxxxxxx>
  • To: haiku@xxxxxxxxxxxxx
  • Date: Fri, 30 Oct 2009 22:51:34 +0100 CET

> As I am sure you understand by now, what the addon does is convert a
> SHIFT-JIS encoded filename, which displays garbled, into a readable
> UTF-8 encoded filename.
> 
> The files with SHIFT-JIS encoded filenames do originate in Windows
> machines (which are pervasive, btw), but they can come to your desktop
> in a variety of ways, and not just from a direct file copy from a FAT
> volume. In my particular case, I get such files mostly via email
> attachments and ZIP files, but there may be other scenarios which I am
> not aware of. So this...

OK, and we can't kill all the Windows users around, so... ;)

> From what I recall, unzip-latin lets you properly list files with
> SHIFT-JIS encoded filenames, which would otherwise be displayed garbled
> (when you look at the content of the archive).
> 
> My understanding is that it changes nothing when extracting the file,
> and that the SJIS2UTF8 Tracker addon still has to be used after file
> extraction.

I see. It's the same issue we sometimes get here with zip files made on
Windows, except that with ISO encodings usually only a few characters
are garbled, so you can usually guess the name, which I suppose is not
the case for SHIFT-JIS.
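
To make that difference concrete, here is a minimal Python sketch. It is not the SJIS2UTF8 add-on's code; the function names and the sample filenames are made up for illustration. It shows how the same raw bytes look when a UTF-8 system displays them, and how decoding them with the sender's encoding recovers the name.

```python
def as_seen_by_utf8(raw: bytes) -> str:
    """Decode raw filename bytes the way a UTF-8-only system would show them.

    Bytes that are not valid UTF-8 become U+FFFD (the replacement character).
    """
    return raw.decode("utf-8", errors="replace")


def repair_name(raw: bytes, source_encoding: str) -> str:
    """Re-decode the same bytes with the encoding the sender actually used."""
    return raw.decode(source_encoding)


# Hypothetical sample names, written with escapes to keep the source ASCII:
# a three-kanji Japanese name and "cafe" with an acute accent.
sjis_raw = "\u65e5\u672c\u8a9e.txt".encode("shift_jis")
latin_raw = "caf\u00e9.txt".encode("iso-8859-1")

latin_view = as_seen_by_utf8(latin_raw)  # still guessable: only the accent is lost
sjis_view = as_seen_by_utf8(sjis_raw)    # mostly replacement characters
fixed = repair_name(sjis_raw, "shift_jis")

# Storing the repaired name as UTF-8 is then a plain re-encode.
fixed_utf8 = fixed.encode("utf-8")
```

The repair only works if the source encoding is known, which is exactly the piece of information that gets lost on the way.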

There doesn't seem to be an easy solution to this anyway, since neither
FAT nor zip carries the filesystem encoding used (plus, when you work
with many OSes, you usually end up with files with UTF-8 names and
others with ISO names in the same folder anyway).

I just don't want to end up with 10 different tools, one for Russian,
one for ...

Maybe SJIS2UTF8 could be made more flexible, with a heuristic to guess
the source encoding and suggest changes with an OK/Cancel dialog?
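
A very rough Python sketch of such a heuristic follows. The candidate list, its order, and the name `suggest_names` are assumptions for illustration, not anything SJIS2UTF8 does today. The idea is to leave names that are already valid UTF-8 alone, and otherwise collect every candidate decoding that succeeds so a dialog can offer them for confirmation.

```python
from typing import List, Tuple

# Hypothetical candidate list, ordered from strictest to most permissive.
# Single-byte codecs such as KOI8-R and ISO-8859-1 accept any byte string,
# so they can only ever be offered as guesses, never confirmed.
CANDIDATES = ("shift_jis", "euc_jp", "koi8_r", "iso-8859-1")


def suggest_names(raw: bytes, candidates=CANDIDATES) -> List[Tuple[str, str]]:
    """Return (encoding, decoded name) pairs worth showing to the user.

    An empty list means the name is already valid UTF-8 and needs no change.
    """
    try:
        raw.decode("utf-8")
        return []
    except UnicodeDecodeError:
        pass

    suggestions = []
    for encoding in candidates:
        try:
            suggestions.append((encoding, raw.decode(encoding)))
        except UnicodeDecodeError:
            continue  # bytes are not legal in this encoding; skip it
    return suggestions
```

Strict decoding alone cannot tell the permissive encodings apart, which is why the result is a list of suggestions for the user to accept or cancel rather than an automatic rename.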

Things like *Emacs or irssi do have quite good methods of guessing 
encodings.

Maybe ICU has something for this, Adrien?

François.
