[dokuwiki] Re: PHP Script to help migrate to UTF-8 file/directory names

  • From: WC Jones <wcjones@xxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Fri, 15 Oct 2010 11:24:56 -0400

On Fri, Oct 15, 2010 at 10:59 AM, Daniel Dupriest <kououken@xxxxxxxxx> wrote:

> Pagename:
> 新たな社会的ニーズに対応した学生支援プログラム
> Becomes:
> %E6%96%B0%E3%81%9F%E3%81%AA%E7%A4%BE%E4%BC%9A%E7%9A%84%E3%83%8B%E3%83%BC%E3%82%BA%E3%81%AB%E5%AF%BE%E5%BF%9C%E3%81%97%E3%81%9F%E5%AD%A6%E7%94%9F%E6%94%AF%E6%8F%B4%E3%83%97%E3%83%AD%E3%82%B0%E3%83%A9%E3%83%A0.txt


> The use of UTF-8 filenames doesn't seem to be a problem in my case. The fact
> that I have 1,600+ url-encoded filenames/directories and no clear way to
> convert them IS however.

I see; I misunderstood the original posting then  :P   The file names
are being stored URL-encoded when they likely should NOT be --
especially since the file system is perfectly happy with the multibyte
character set.
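
For what it's worth, here is a minimal sketch of the decoding step
itself, in Python rather than PHP and only as an illustration. The
helper name decode_name is made up for this example, the sample name
is just the first three characters of the page name quoted above, and
I am assuming the stored names are plain percent-encoded UTF-8; I have
not checked how DokuWiki actually produces them.

```python
from urllib.parse import quote, unquote


def decode_name(encoded: str) -> str:
    """Turn a percent-encoded file name back into its UTF-8 form.

    Hypothetical helper for illustration only. errors="strict" makes
    the call raise UnicodeDecodeError instead of silently inserting
    replacement characters when the bytes are not valid UTF-8.
    """
    return unquote(encoded, encoding="utf-8", errors="strict")


# Shortened sample: the first three characters of the quoted page name.
encoded = "%E6%96%B0%E3%81%9F%E3%81%AA.txt"
decoded = decode_name(encoded)
print(decoded)

# Re-encoding the result and comparing it with the input is a cheap
# check that nothing was lost along the way.
print(quote(decoded) == encoded)
```

Decoding strictly and then re-encoding to compare is the part I would
keep whatever language the real script ends up in.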

I guess nothing really has changed in the last two years :(   I ran
into a similar problem where 50+ Apple Mac OS X 10.5.x clients were
using a RHEL system as an Apple share (without any issues), but then
they upgraded to 10.6.x and all hell broke loose: even though Apple is
UTF-8 and RHEL is UTF-8, the two implementations aren't completely
compatible, so your mileage may vary when trying to be UTF-8 compliant.

May I suggest?  Try attacking it from a MultiByte Japanese OS
standpoint and not from a purely PHP or Dokuwiki one.

The thing to avoid is a massive translation or conversion without
stepwise testing to make sure the filenames you end up with are the
ones you expected.
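
One way to do that kind of stepwise testing is a dry run that only
builds a list of intended renames and flags anything suspicious,
without touching the file system. The sketch below is in Python and is
only an illustration: plan_renames and the sample names are
hypothetical, and it assumes the names are percent-encoded UTF-8.

```python
from urllib.parse import unquote


def plan_renames(names):
    """Return (plan, skipped) for a list of file names; renames nothing.

    plan    -- list of (old_name, new_name) pairs that look safe
    skipped -- list of (old_name, reason) pairs to inspect by hand
    """
    plan, skipped = [], []
    seen = set(names)  # names that already exist in the directory
    for old in names:
        try:
            new = unquote(old, encoding="utf-8", errors="strict")
        except UnicodeDecodeError:
            skipped.append((old, "not valid UTF-8 after decoding"))
            continue
        if new == old:
            continue  # nothing to decode, leave the file alone
        if "/" in new or "\x00" in new:
            skipped.append((old, "decoded name contains a path separator"))
            continue
        if new in seen:
            skipped.append((old, "collides with another name"))
            continue
        seen.add(new)
        plan.append((old, new))
    return plan, skipped


# Made-up sample directory listing.
names = ["%E6%96%B0%E3%81%9F%E3%81%AA.txt", "start.txt",
         "%FF.txt", "%73tart.txt"]
plan, skipped = plan_renames(names)
for old, new in plan:
    print("would rename", old, "->", new)
for old, reason in skipped:
    print("skipping", old, "(" + reason + ")")
```

Reviewing the printed plan on a small directory first, and only then
doing the actual renames in batches, should keep any surprises small.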

At any rate, please post back here (or e-mail me directly), as I am
very interested in which solution path you find gives the best
results.
--
DokuWiki mailing list - more info at
http://www.dokuwiki.org/mailinglist
