[dokuwiki] Re: search improvements
- From: Guy Brand <gb@xxxxxxxxxxxxxxxxx>
- To: dokuwiki@xxxxxxxxxxxxx
- Date: Thu, 31 Aug 2006 14:35:32 +0200
On 31 Aug at 01:58, Chris Smith wrote:
> Hi,
Hi Chris,
> My preference would be to run with the utf8 algorithm.
+1
> For my test wiki two similar search terms producing similar results, but
> selected from opposite ends of an 11,000+ word index resulted in a
> doubling of the search time. Guy found similar results for single
> search terms at opposite ends of his ~10,000 word index. It would seem
> the bigger the wiki, the more words likely to get in the index, the
> slower, on average, searching is likely to be. Ideas on improving this
> are welcome :-)
Use a db to store these indexes? :-)
AFAIU, a search for a word/pagename first yields the line number(s)
at which the word appears in word.idx; those same line numbers are
then read from index.idx (line numbers in word.idx match those in
index.idx), where each line lists one or several pageid:hits-in-page
entries.
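
A minimal sketch of that two-file lookup, in Python for illustration
only (it is not DokuWiki's actual PHP code). The function name, the
whitespace-separated pageid:hits entries, and the sample data are
assumptions:

```python
def lookup_two_files(word, word_lines, index_lines):
    """Return {pageid: hits} for `word` using two parallel line lists.

    word_lines  : lines of word.idx, one word per line
    index_lines : lines of index.idx; line n holds the postings for
                  the word on line n of word.idx (assumed layout)
    """
    for lineno, candidate in enumerate(word_lines):
        if candidate.strip() == word:
            postings = index_lines[lineno].split()
            result = {}
            for entry in postings:
                # rsplit so that only the last colon separates the hit count
                pid, hits = entry.rsplit(":", 1)
                result[pid] = int(hits)
            return result
    return {}  # word not in the index


# Hypothetical sample data: both files must be available in full,
# because the only key linking them is the line number.
words = ["wiki\n", "search\n"]
index = ["1:3 2:1\n", "2:5\n"]
print(lookup_two_files("search", words, index))
```

Both lists have to be held (or scanned) together, which is where the
memory cost of large index files comes from.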
When the content of a wiki grows, both index.idx and word.idx
inflate. I have a 100 MB wiki with 6k files, and had to build the
index using an external shell script (php binary not available on
the target host), but it was useless, as php needed more than 80 MB
of memory to process the two 19 MB index files.
How about merging the index.idx and word.idx files into a single
index.idx:
  word1 pageid1:hits pageid2:hits pageid5:hits pageid6:hits
  word2 pageid2:hits pageid3:hits pageid4:hits
  word3 pageid3:hits
that is, with the word as the first column instead of nothing (the
key today is implicitly the line number), and thus avoid loading a
word.idx file that grows at least O(n)?
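
A rough sketch of a lookup against such a merged file, again in Python
for illustration. The function name and the sample lines are
hypothetical; the point is only that the file can be streamed line by
line and the scan can stop at the first match, so no second file needs
to be loaded:

```python
import io


def lookup_merged(word, lines):
    """Return {pageid: hits} for `word` from a merged index.

    `lines` is any iterable of text lines (for example an open file),
    each of the assumed form: word pageid:hits pageid:hits ...
    """
    for line in lines:
        fields = line.split()
        if fields and fields[0] == word:
            result = {}
            for entry in fields[1:]:
                # rsplit so that only the last colon separates the hit count
                pid, hits = entry.rsplit(":", 1)
                result[pid] = int(hits)
            return result  # stop reading as soon as the word is found
    return {}


# Hypothetical sample; a real file would be passed as open(path).
sample = io.StringIO(
    "word1 pageid1:4 pageid2:1\n"
    "word2 pageid2:2 pageid3:7\n"
)
print(lookup_merged("word2", sample))
```

This is still a linear scan, so it addresses memory use rather than
the position-dependent search time mentioned above.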
Maybe the page.idx file should store a unique number (id) for each
page, instead of using the line number in this file as the pageid
(pid). This could let us call a page by its id (pid) and shorten
very long URLs.
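
As an illustration of the explicit-id idea, here is a small Python
sketch. The "id pagename" line layout and the function name are
assumptions made for the example, not an existing page.idx format:

```python
def load_page_ids(lines):
    """Parse lines of the assumed form 'id pagename' into {id: pagename}."""
    pages = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        pid, name = line.split(maxsplit=1)
        pages[int(pid)] = name.strip()
    return pages


# Hypothetical content: ids are stored, not derived from line position,
# so removing one entry leaves every other id unchanged.
pages = load_page_ids(["3 wiki:syntax\n", "7 playground:playground\n"])
del pages[3]
print(pages[7])
```

With ids taken from line numbers instead, deleting a line would shift
the id of every page after it.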
--
bug
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
- References:
- [dokuwiki] Re: search improvements
- From: Andreas Gohr
- [dokuwiki] Re: search improvements
- From: Chris Smith
- [dokuwiki] Re: search improvements
- From: Andreas Gohr
- [dokuwiki] Re: search improvements
- From: Chris Smith
- [dokuwiki] Re: search improvements
- From: Chris Smith
- [dokuwiki] Re: search improvements
- From: Chris Smith
- [dokuwiki] Re: search improvements
- From: Guy Brand
- [dokuwiki] Re: search improvements
- From: Chris Smith
- [dokuwiki] Re: search improvements
- From: Guy Brand
- [dokuwiki] Re: search improvements
- From: Chris Smith