[dokuwiki] Re: Search Result

So ... it seems to me that the obstacle to rendered searches is simply a 
speed issue. But perhaps an insurmountable one.

1) Indexing is good: it happens as the page is edited, and therefore 
rendering takes place anyway. So the cache is fresh, and we'd be OK if we 
wanted to index the rendered text instead of using the present method.
2) Snippets are bad: We can never guarantee that the dozen pages requested 
are already rendered, so (worst case) we'd have to render all the pages. Too 
slow.
3) Phrases are very bad. We'd have to render _every_ page not cached in 
order to compare. Much too slow.

What about a compromise/alternative? It seems to me that the biggest 
objection to raw searching is that it doesn't look "familiar" to the 
user--they see the markup. But what if we just stripped the wiki markup from 
the raw text--wouldn't that be fast and do-able, and look better to the 
user?

For instance, in the 3 places mentioned, there's a call to rawWiki() which 
could be replaced with a function to strip wiki tags. (Or rawWiki() itself 
could be modified to accept another parameter.) Simplistically, something 
like

    html_entity_decode(preg_replace('/[^\w\.!?"\']+/',' ',rawWiki($id)),ENT_QUOTES)

could be used, but there's probably something far better.
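
To make the idea concrete, here is a rough sketch of such a helper. The name 
stripWikiMarkup() is hypothetical (it is not an existing DokuWiki function), 
and it takes the raw text as an argument so it can be tried without DokuWiki 
loaded. One deliberate difference from the one-liner above: entities are 
decoded before stripping, because the character class removes the "&" and 
";" that entities depend on.

```php
<?php
// Hypothetical helper, not part of DokuWiki: reduce raw wiki text to
// plain words and basic punctuation for display in search results.
function stripWikiMarkup($raw)
{
    // Decode HTML entities first (ENT_QUOTES also covers &quot; and &#039;).
    $text = html_entity_decode($raw, ENT_QUOTES);

    // Replace each run of characters that are not word characters or
    // basic punctuation (. ! ? " ') with a single space.
    $text = preg_replace('/[^\w.!?"\']+/', ' ', $text);

    // Drop the leading/trailing space left where markup was removed.
    return trim($text);
}

// Example call; inside DokuWiki the argument would come from rawWiki($id).
echo stripWikiMarkup("====== Heading ======\nSome **bold** text."), "\n";
```

This is crude: link targets such as [[wiki:link|a link]] survive as loose 
words, and without the /u modifier \w only matches ASCII, so non-ASCII 
letters would be stripped as well.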

I'm flame-proof -- tell me what you really think of this.

Todd Augsburger
todd@xxxxxxxxxxxxxxxx
Roller Organs
www.rollerorgans.com

 

-- 
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
