[dokuwiki] Re: PHP Script to help migrate to UTF-8 file/directory names

  • From: Christopher Smith <chris@xxxxxxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Fri, 15 Oct 2010 18:26:04 +0100

On 15 Oct 2010, at 00:49, Daniel Dupriest wrote:

> Howdy Dokuwiki folks!
> 
> 
> I'd like to know if anyone familiar enough with Dokuwiki's ins and outs could 
> create such a script, or if something similar isn't already in the works. 
> Possibly a helper script like the one used when Dokuwiki did the utf8update? 
> Also, if there is, in fact, a good CLI tool that can do this in Linux please 
> let me know!
> 

You might try the following: http://www.dokuwiki.org/tips:convert_to_utf8

The bash + PHP works, but I can't guarantee DokuWiki will like the results.  I 
believe it should, because of how page names are processed:

1. A UTF-8 page name comes into DokuWiki.
2. DokuWiki strips any character that shouldn't be used.
3. If necessary, DokuWiki encodes the filename.

The aim is to reverse the earlier encoding done at step 3.  Since that is a 
straight encoding from UTF-8, all we should need to do is decode the result.  
Any "bad" characters in the original UTF-8 page name were removed at step 2, 
so they won't have been included in the encoded result.
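
A minimal sketch of that decoding step, written in Python rather than the 
bash + PHP mentioned above.  It assumes the encoded file names are plain 
percent-encoded UTF-8 (e.g. "%C3%BC" for "ü"); that is an assumption about the 
encoding at step 3, not something checked against the DokuWiki source, and 
decode_name is a made-up helper name.

```python
from urllib.parse import unquote


def decode_name(encoded):
    """Reverse percent-encoding of a file name (assumed to be UTF-8 bytes)."""
    # errors="strict" makes unquote raise UnicodeDecodeError when the decoded
    # bytes are not valid UTF-8, instead of silently inserting U+FFFD.
    return unquote(encoded, encoding="utf-8", errors="strict")
```

With this, decode_name("%C3%BCbersicht.txt") gives "übersicht.txt", and a name 
with nothing encoded in it comes back unchanged.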

Anyway, that is my logic.  I wouldn't throw away anything until you have 
successfully spidered your wiki.

Obviously this doesn't take into account any encoded page names that may be 
used in your wiki content.  Someone with more time might care to add error 
checking, do the directory recursion within PHP, or build a plugin that uses 
DokuWiki functions to iterate over all the pages and do the conversions in 
place.
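
For what the error checking and directory recursion might look like, here is a 
rough sketch.  It is in Python rather than PHP, it again assumes the names are 
percent-encoded UTF-8, and plan_renames / apply_renames are hypothetical 
helpers, not DokuWiki functions.  Planning and renaming are kept separate so 
the plan can be inspected before anything on disk changes.

```python
import os
from urllib.parse import unquote


def plan_renames(root):
    """Collect (old_path, new_path) pairs for percent-encoded names under root.

    Returns (plan, problems); nothing on disk is touched here.
    """
    plan, problems = [], []
    # topdown=False yields a directory's contents before that directory shows
    # up in its parent's listing, so children are planned before parents.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            old = os.path.join(dirpath, name)
            try:
                decoded = unquote(name, errors="strict")
            except UnicodeDecodeError:
                problems.append((old, "does not decode to valid UTF-8"))
                continue
            if decoded == name:
                continue  # nothing encoded in this name
            if "/" in decoded or "\0" in decoded or decoded in (".", ".."):
                problems.append((old, "decoded name is not a usable file name"))
                continue
            plan.append((old, os.path.join(dirpath, decoded)))
    return plan, problems


def apply_renames(plan):
    """Perform the planned renames, refusing to overwrite existing targets."""
    skipped = []
    for old, new in plan:
        if os.path.exists(new):
            skipped.append((old, new))
            continue
        os.rename(old, new)
    return skipped
```

Run plan_renames on a copy of the data directory first, look over both 
returned lists, and only then pass the plan to apply_renames.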

- Chris
