[dokuwiki] Re: idea for improving the index

  • From: Wes <stararmy@xxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Thu, 12 Jun 2008 22:50:31 -0400

One thing I would like the search to do is display page titles, as it
did in earlier versions. I believe this has been broken since the 3/28
or 4/11 release.

Attached are two screenshots of a brand-new installation, showing the behavior.

-Wes



On Thu, Jun 12, 2008 at 8:56 PM, Christopher Smith <chris@xxxxxxxxxxxxx> wrote:
>
> On 13 Jun 2008, at 00:01, Uwe Koloska wrote:
>
>>
>> So I have examined the indexer and found a simple solution: delete any URL
>> before the words are written into the index.  To find the URLs I use this
>> regular expression:
>>  "/[a-z]+:\/\/[^|\s]*/"
>>
>> <snip>
>>
>> So, what do you think about this?
>> - is this the right thing to do?
>>
>
> I don't think it is the right thing to do.
>
> It might be worthwhile to eliminate the "http", "www" & "com" from URLs
> before sending the raw wiki text to the indexer, but I don't think it's
> sensible to strip out the main part of the domain name, which is a useful
> search term.  However, the list of items to eliminate is likely to be
> complex, e.g. should it include "org", "co", "uk" or "de"?  That implies a
> user-configurable list.
>
> Given the added complexity, it might be more sensible to handle this in a
> plugin - attached to an event which allows filtering of raw wiki text before
> handing it to the indexer ... or perhaps a more complex set of events to
> allow for replacement search/indexing mechanisms.
>
> - Chris
> --
> DokuWiki mailing list - more info at
> http://wiki.splitbrain.org/wiki:mailinglist
>
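
For anyone comparing the two ideas in the quoted thread, here is a rough sketch of both, written in Python rather than DokuWiki's PHP. It is not the indexer's actual code. The regular expression is the one Uwe quoted, with the PHP delimiters and escaping removed. The function names and the default noise list are hypothetical, chosen only for illustration; the list is a parameter because Chris suggests it would have to be user-configurable.

```python
import re

# Python form of the pattern quoted in the thread: "/[a-z]+:\/\/[^|\s]*/"
URL_PATTERN = re.compile(r"[a-z]+://[^|\s]*")

# Hypothetical default. The thread only names "http", "www" and "com" and
# leaves open whether "org", "co", "uk" or "de" belong here as well.
DEFAULT_NOISE_TOKENS = frozenset({"http", "www", "com"})


def strip_urls(text):
    """First idea: delete every URL before the words reach the index."""
    return URL_PATTERN.sub("", text)


def strip_url_noise(text, noise=DEFAULT_NOISE_TOKENS):
    """Second idea: keep the words of each URL, minus the noise tokens."""

    def keep_useful_words(match):
        # Split the matched URL into alphanumeric words and filter them.
        words = re.findall(r"[a-z0-9]+", match.group(0).lower())
        return " ".join(word for word in words if word not in noise)

    return URL_PATTERN.sub(keep_useful_words, text)


sample = "See http://www.splitbrain.org/projects for details"
print(strip_urls(sample))
print(strip_url_noise(sample))
print(strip_url_noise(sample, noise=DEFAULT_NOISE_TOKENS | {"org"}))
```

With the first function "splitbrain" is no longer searchable, which is the objection raised above. The second keeps it, at the cost of maintaining the noise list.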

Attachment: Image8.png
Description: PNG image

Attachment: Image9.png
Description: PNG image
