[dokuwiki] Re: PHP Script to help migrate to UTF-8 file/directory names

  • From: Daniel Dupriest <kououken@xxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Fri, 15 Oct 2010 23:59:08 +0900

On Fri, Oct 15, 2010 at 11:37 PM, WC Jones <wcjones@xxxxxxx> wrote:

> On Thu, Oct 14, 2010 at 7:49 PM, Daniel Dupriest <kououken@xxxxxxxxx>
> wrote:
>
> > For example:
> > Pagename:
> > 新たな社会的ニーズに対応した学生支援プログラム
> > Becomes:
> > %E6%96%B0%E3%81%9F%E3%81%AA%E7%A4%BE%E4%BC%9A%E7%9A%84%E3%83%8B%E3%83%BC%E3%82%BA%E3%81%AB%E5%AF%BE%E5%BF%9C%E3%81%97%E3%81%9F%E5%AD%A6%E7%94%9F%E6%94%AF%E6%8F%B4%E3%83%97%E3%83%AD%E3%82%B0%E3%83%A9%E3%83%A0.txt
> > ...which seems to exceed some samba share filename length limit, making it
> > inaccessible over the network, and I'm sure causes other problems that I
> > haven't run into yet.
>
> If the source file system is UTF8 compliant and the target file system
> is UTF8 compliant why change anything?
>
> Have you tried a dual NFS mounted UTF8 file system and keeping
> original page name on Japanese source with a "quoted-printable"
> version only on the NFS share?  I would strongly avoid converting
> or translating filenames without an extreme amount of testing
> because, unless something has changed in the last 2 years, I
> don't feel anyone has agreed on a good interoperable standard yet -- I
> ran into a few issues just trying to keep basic translations between
> English/French/Spanish clear after all the gyrations were done.
>
> Maybe it's just me but a true UTF8 aware filesystem can handle it even
> if the software cannot...
> --
> DokuWiki mailing list - more info at
> http://www.dokuwiki.org/mailinglist
>


Thanks for the reply, WC Jones. I should have mentioned it in my original
post, but the UTF-8 filenames work great for NEW pages. I am able to create,
edit (both through the wiki and through nano or over the network) and manage
them with no problems so far. Some plugins which rely on pagenames/namespaces
may or may not work with UTF-8 names at the moment, but until that
functionality has been tested, they can keep running with ASCII-only names.

The use of UTF-8 filenames doesn't seem to be a problem in my case. The fact
that I have 1,600+ URL-encoded filenames/directories and no clear way to
convert them IS, however. As for testing any filename conversion, I make
backups religiously (thanks to the brilliantly simple DokuWiki flat files),
so I would have no problem reverting if something goes wrong.
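
To make the conversion in question concrete, here is a minimal sketch in
Python (not the PHP script from the original post, and not tested against a
real DokuWiki data directory). The function names decoded_name, plan_renames
and apply_renames are made up for illustration. It assumes the names are
percent-encoded UTF-8, skips anything that does not decode cleanly, and
defaults to a dry run.

```python
import os
from urllib.parse import unquote


def decoded_name(name):
    """Return the UTF-8 form of a URL-encoded name, or None to leave it alone."""
    if "%" not in name:
        return None
    try:
        # errors="strict" rejects byte sequences that are not valid UTF-8.
        decoded = unquote(name, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return None
    # Skip names that would not change, or that would decode to something
    # unsafe as a single path component (e.g. %2F becoming "/").
    if decoded == name or "/" in decoded or "\x00" in decoded:
        return None
    return decoded


def plan_renames(root):
    """List (old_path, new_path) pairs, deepest entries first.

    Walking bottom-up means a directory's contents are listed before the
    directory itself, so every old_path is still valid when it is renamed.
    """
    plan = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            new = decoded_name(name)
            if new is not None:
                plan.append((os.path.join(dirpath, name),
                             os.path.join(dirpath, new)))
    return plan


def apply_renames(plan, dry_run=True):
    """Print each rename; perform it only when dry_run is False."""
    for old, new in plan:
        if os.path.exists(new):
            print("skip (target exists):", new)
            continue
        print(old, "->", new)
        if not dry_run:
            os.rename(old, new)
```

Something like apply_renames(plan_renames("data/pages")) would only print
the planned renames (the path is a placeholder); passing dry_run=False
performs them. For scale, the example pagename quoted above is 211
characters long when URL-encoded with its .txt extension, versus 73 bytes
as raw UTF-8.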

-- 
Daniel  ( ̄ー ̄)b