[dokuwiki] Re: Search Result

  • From: "Jacob Steenhagen" <jacob@xxxxxxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Tue, 26 Feb 2008 14:30:27 -0500

On Tue, Feb 26, 2008 at 1:11 PM, Gerry Weißbach <gerry.w@xxxxxxxxxxxxxxxxxx>
wrote:

> Hum ... well it's a suggestion ... but it's really the last thing before
> disabling ;) I'd prefer something that includes the DW search engine ... if
> there's nothing, I'll invent it ;)
>

I'm really new to DokuWiki, so I can't say it doesn't exist, though I'd
imagine if it did exist it'd be fairly easy to find... perhaps even the
default. It seems what you'd really want is not necessarily rendered text,
but rather text without the markup. Rendered text would make the bold text
bold and italic text italic (not a big deal) but also make heading text into
headings, lists into lists, etc. If you look at the synopsis on Google
search results (http://www.google.com/search?q=dokuwiki) it's basically the
text that's on the web page minus any HTML-applied styling.

Inventing it may not be terribly difficult. It may be as easy as running the
text through a few regular expressions:
s/\{\{.*?\}\}//g
s/\*\*(.*?)\*\*/$1/g
s/\[\[[^\]|]*\|(.*?)\]\]/$1/g
s/\[\[(.*?)\]\]/$1/g
s/(={2,6})(.*?)\1/$2/g
etc

There are a lot of backslashes here because most of DokuWiki's markup
characters also have special meanings in regular expressions. It may end up being more
complicated than that or maybe there's even a sanitize function already in
the DokuWiki source that gives the "plain text rendering" of the page.
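
For what it's worth, here is a rough Python sketch of the same idea, in case
it helps to see the substitutions chained together. The function name
strip_markup and the rule list are made up for illustration and are not
anything from the DokuWiki source. It only covers the handful of constructs
listed above, and it uses non-greedy matches so that two bold spans on one
line are handled separately.

```python
import re

# Hypothetical helper, not part of DokuWiki. Each entry is
# (compiled pattern, replacement); rules are applied in order.
_RULES = [
    (re.compile(r"\{\{.*?\}\}"), ""),                # {{media}} embeds: dropped
    (re.compile(r"\*\*(.*?)\*\*"), r"\1"),           # **bold** -> bold
    (re.compile(r"\[\[[^\]|]*\|(.*?)\]\]"), r"\1"),  # [[target|label]] -> label
    (re.compile(r"\[\[(.*?)\]\]"), r"\1"),           # [[target]] -> target
    (re.compile(r"(={2,6})(.*?)\1"), r"\2"),         # == Heading == -> Heading
]


def strip_markup(text):
    """Return text with a few DokuWiki constructs reduced to plain text."""
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text
```

Calling strip_markup("**bold** and [[wiki:page|a link]]") should give back
"bold and a link". Whitespace around heading text is left as it was, and
anything not in the rule list (italics, lists, tables, nowiki blocks) passes
through untouched, so a real version would need more rules or a proper
renderer.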

-- 
http://jacob.steenhagen.us
