[dokuwiki] Re: invalid anchors

  • From: Andreas Gohr <andi@xxxxxxxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Fri, 10 Feb 2006 21:32:29 +0100

On Fri, 10 Feb 2006 15:13:27 +0100
Gesu` <petrax@xxxxxxxxxxx> wrote:

> I use MoinMoin and I like the way it handles anchors inside documents;
> in the TOC it creates links like:
> 
> <a href="#head-173e91fa82dc1bc8a1a9a78f9a464bb93f0e59e5">Cosa è
> Usenet?</a>

Looks like a hash (SHA1 maybe?). I don't really like it.

> I just made a test, and I discovered that also extended characters are
> allowed, [[(Anchor(aèiou)]] became:
> <a id="aèiou"></a>

Bad, because it's not valid, as we have learned.

I just pushed a patch which is the first part of making valid but still
nice anchor ids. The patch adds romanization to the UTF-8 library.
Denis suggested this some days ago as the way Wacko wiki handles it.
It is not a general solution, but it is good enough for readable
anchors in most languages (it currently covers Cyrillic, Sanskrit,
Hebrew, Arabic, Japanese hiragana and katakana, Greek, Thai and
Korean). As far as I can see there is no reasonable transcription
for Chinese...

Here is how I want to implement the new ids:

1. Take the section title
2. Romanize and deaccent it
3. Strip all remaining non-ASCII (multibyte UTF-8) chars
4. Strip all leading digits
5. If empty, call it "section"
6. Check if the id already exists; if yes, append "1"
7. Repeat step 6 if the new one still exists, increasing the number
8. Done

Steps 1 to 4 are already done; the others aren't yet.
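
The steps above could be sketched roughly as follows. This is a minimal Python illustration, not the DokuWiki patch itself: the function name `make_anchor_id` and the `existing_ids` set are hypothetical, romanization of non-Latin scripts is left out (it needs a lookup table), and lowercasing plus replacing other characters with "_" is an assumption the list does not spell out.

```python
import re
import unicodedata


def make_anchor_id(title, existing_ids):
    """Build a readable, ASCII-only anchor id from a section title.

    existing_ids is a set of ids already used on the page; the new id
    is added to it before returning.
    """
    # Step 2 (deaccent only): decompose accented letters, drop the marks.
    # Romanizing Cyrillic, Greek, etc. would need a table; omitted here.
    decomposed = unicodedata.normalize("NFKD", title)
    deaccented = "".join(c for c in decomposed if not unicodedata.combining(c))

    # Step 3: strip everything that is still non-ASCII.
    ascii_only = deaccented.encode("ascii", "ignore").decode("ascii")

    # Assumption: lowercase, and collapse runs of other characters to "_".
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_only.lower()).strip("_")

    # Step 4: strip leading digits (and any "_" left in front of the rest).
    slug = re.sub(r"^[0-9_]+", "", slug)

    # Step 5: fall back to a fixed name if nothing is left.
    if not slug:
        slug = "section"

    # Steps 6 and 7: append 1, 2, 3, ... until the id is unused.
    candidate = slug
    counter = 1
    while candidate in existing_ids:
        candidate = f"{slug}{counter}"
        counter += 1

    existing_ids.add(candidate)
    return candidate


if __name__ == "__main__":
    used = set()
    for heading in ["Cosa è Usenet?", "Cosa è Usenet?", "2006 news", "123"]:
        print(heading, "->", make_anchor_id(heading, used))
```

With this sketch a title that contains only unsupported characters ends up as "section", and repeated titles get a numeric suffix, so every id on a page stays unique.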

Andi
-- 
http://www.splitbrain.org
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist



