[dokuwiki] Re: Search Result
- From: Christopher Smith <chris@xxxxxxxxxxxxx>
- To: dokuwiki@xxxxxxxxxxxxx
- Date: Tue, 26 Feb 2008 20:49:18 +0000
On 26 Feb 2008, at 19:30, Jacob Steenhagen wrote:
On Tue, Feb 26, 2008 at 1:11 PM, Gerry Weißbach <gerry.w@xxxxxxxxxxxxxxxxxx> wrote:
Hmm ... well, it's a suggestion ... but it's really the last thing
before disabling ;)
I'd prefer something that includes the DW search engine ... if
there's nothing, I'll invent it ;)
I'm really new to DokuWiki, so I can't say it doesn't exist, though
I'd imagine if it did exist it'd be fairly easy to find... perhaps
even the default. It seems what you'd really want is not necessarily
rendered text, but rather text w/out the entities. Rendered text
would make the bold text bold and italic text italic (not a big
deal) but also make heading text into headings, lists into lists,
etc. If you look at the synopsis on Google search results
(http://www.google.com/search?q=dokuwiki) it's basically the text
that's on the web page minus any HTML-applied styling.
Inventing it may not be terribly difficult. It may be as easy as
running the text through a few regular expressions:
s/\{\{.*?\}\}//g
s/\*\*(.*?)\*\*/$1/g
s/\[\[[^\]|]*\|(.*?)\]\]/$1/g
s/\[\[(.*?)\]\]/$1/g
s/(={2,6})(.*?)\1/$2/g
etc
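As a rough illustration only, the substitutions above could be collected into one small function. This is a minimal sketch in Python rather than anything taken from the DokuWiki source; the name strip_wiki_markup is hypothetical, and the patterns use non-greedy matching so that two marked-up spans on the same line are not merged into one.

```python
import re

# Hypothetical helper, not part of DokuWiki: an ordered list of
# (pattern, replacement) pairs covering only the markup mentioned above.
_RULES = [
    (re.compile(r"\{\{.*?\}\}"), ""),                # {{media}} embeds are dropped
    (re.compile(r"\*\*(.*?)\*\*"), r"\1"),           # **bold** -> bold
    (re.compile(r"\[\[[^\]|]*\|(.*?)\]\]"), r"\1"),  # [[target|title]] -> title
    (re.compile(r"\[\[(.*?)\]\]"), r"\1"),           # [[target]] -> target
    (re.compile(r"(={2,6})(.*?)\1"), r"\2"),         # == heading == -> heading
]


def strip_wiki_markup(text):
    """Remove a small subset of DokuWiki markup, keeping the visible text."""
    # Order matters: titled links must be handled before bare links.
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(strip_wiki_markup("See **this** [[wiki:syntax|syntax page]] {{logo.png}}"))
```

Italics, lists, tables and plugin syntax are not handled here, so a real solution would need more rules or a different approach.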
There are a lot of backslashes here because most of the entities in
DokuWiki also have special meanings in regular expressions. It may
end up being more complicated than that or maybe there's even a
sanitize function already in the DokuWiki source that gives the
"plain text rendering" of the page.
--
http://jacob.steenhagen.us
Without wishing to put anyone off who is motivated to extend DW in
this direction, here are some thoughts...
The issue is that only a snippet of raw wiki text is displayed, a few
characters on either side of the highlighted search term to give that
term context. The snippet itself is not guaranteed to be well-formed
wiki text, making it futile to attempt to render it in the same way
that a page is normally rendered. Grabbing a snippet in this way,
while perhaps not pretty, is fast.
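To make the problem concrete, here is a minimal sketch of that kind of snippet extraction. It is written in Python for illustration and is not the DokuWiki implementation; the function name make_snippet and the context parameter are assumptions.

```python
import re


def make_snippet(text, term, context=20):
    """Return the first match of term with `context` characters either side.

    Returns None if the term does not occur. The cut points ignore wiki
    markup entirely, which is why the result may not be well-formed.
    """
    match = re.search(re.escape(term), text, re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - context)
    end = min(len(text), match.end() + context)
    return text[start:end]


if __name__ == "__main__":
    raw = "a **keyword in bold** b"
    # The opening ** is kept but the closing ** is cut off.
    print(make_snippet(raw, "keyword", context=4))
```

The slice is cheap, but in the example the bold marker is opened and never closed, so a renderer has nothing sensible to work with.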
A couple of potential alternatives:
- To search on rendered content rather than (or in addition to) raw
wiki text would require a new/different search mechanism within
DokuWiki. While that may be desirable, it's probably not trivial. A
simpler alternative may be to offer two search mechanisms: Google (or
another search engine) search using "<search terms> site:mysite.com"
syntax, and wiki search using the current DW mechanism.
- To grab the entire rendered output for each page in the search
results and then to take a snippet of that output surrounding the
search term (if it still exists), is likely to be infeasible in terms of
page response time and is also likely to be a non-trivial task.
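For the external search option, the query only needs the site restriction appended to the user's terms. A minimal sketch in Python, assuming Google's "site:" operator and the search URL quoted earlier in this thread; the function name site_search_url is hypothetical and mysite.com is a placeholder.

```python
from urllib.parse import urlencode

# Base URL as quoted earlier in the thread.
GOOGLE_SEARCH = "http://www.google.com/search"


def site_search_url(terms, site):
    """Build a search URL that restricts results to a single site."""
    query = "{} site:{}".format(terms, site)
    return GOOGLE_SEARCH + "?" + urlencode({"q": query})


if __name__ == "__main__":
    print(site_search_url("search result", "mysite.com"))
```

A wiki template could link to such a URL next to the normal DW search box, leaving the existing search untouched.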
-Chris
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
- Follow-Ups:
- [dokuwiki] Re: Search Result
- From: Todd Augsburger
- References:
- [dokuwiki] Search Result
- From: Gerry Weißbach
- [dokuwiki] Re: Search Result
- From: Jacob Steenhagen
- [dokuwiki] Re: Search Result
- From: Gerry Weißbach
- [dokuwiki] Re: Search Result
- From: Jacob Steenhagen