[dokuwiki] Re: search improvements
- From: Thanos Massias <tm@xxxxxxxxxxx>
- To: dokuwiki@xxxxxxxxxxxxx
- Date: Thu, 31 Aug 2006 14:45:10 +0300
Chris Smith wrote:
> Hi,
>
> [snip]
>
> The analysis has thrown up another factor in search performance, the
> location of the search term(s) within the word index.
> For my test wiki two similar search terms producing similar results, but
> selected from opposite ends of an 11,000+ word index resulted in a
> doubling of the search time. Guy found similar results for single
> search terms at opposite ends of his ~10,000 word index. It would seem
> the bigger the wiki, the more words likely to get in the index, the
> slower, on average, searching is likely to be. Ideas on improving this
> are welcome :-)
>
You could try splitting the word data, but this will get quite
complicated. For example, you could do some hashing by splitting the word
index into a series of data files depending on, say, the first character (or
more, if we are talking about huge wikis) of the word. Then, given a
search word, you only have to search the relevant word index file.
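
To make the idea concrete, here is a minimal sketch in Python. It is not DokuWiki's actual index code or file format: in-memory lists stand in for the per-character data files, the function names (`bucket_key`, `build_buckets`, `lookup`) are hypothetical, and the words are assumed to be already normalised (e.g. lowercased) by the indexer.

```python
from collections import defaultdict

def bucket_key(word, prefix_len=1):
    """Name of the bucket a word belongs to: its first prefix_len characters."""
    return word[:prefix_len]

def build_buckets(words, prefix_len=1):
    """Split a flat word index into buckets keyed by leading character(s).

    Each bucket stands in for one data file and keeps (word, position)
    pairs, where position is the word's line in the original flat index.
    """
    buckets = defaultdict(list)
    for position, word in enumerate(words):
        buckets[bucket_key(word, prefix_len)].append((word, position))
    return buckets

def lookup(buckets, term, prefix_len=1):
    """Scan only the bucket that can contain term; return its position or None."""
    for word, position in buckets.get(bucket_key(term, prefix_len), []):
        if word == term:
            return position
    return None

# Tiny illustrative index (made-up words, not real wiki data).
index = ["apple", "zebra", "ant", "zoo", "cat"]
buckets = build_buckets(index)
print(lookup(buckets, "zoo"))   # scans the 2-entry "z" bucket, not all 5 words
```

Raising `prefix_len` to 2 gives smaller buckets for huge wikis, at the cost of many more files to maintain.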
For example, the words starting with Latin characters in my wiki are distributed as follows:
Starting  Words per        Total words divided by
letter    starting letter  words per starting letter
a         314               14.1
b         176               25.2
c         449                9.9
d         289               15.3
e         215               20.6
f         183               24.2
g         107               41.4
h         110               40.3
i         248               17.9
j          22              201.5
k          34              130.4
l         157               28.2
m         235               18.9
n         107               41.4
o         135               32.8
p         334               13.3
q          20              221.6
r         283               15.7
s         463                9.6
t         252               17.6
u          84               52.8
v          83               53.4
w         102               43.5
x          10              443.2
y          13              340.9
z           7              633.1
The last column indicates what the improvement could be: roughly the
factor by which the amount of index data to search shrinks. Not much
for words starting with 'c' or 's', but huge for those starting with
'z' or 'x'. If you also have numbers and non-Latin alphabets, the
improvement goes even further.
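
The last column can be reproduced from the per-letter counts alone. The short Python sketch below does that; the function name `reduction_factors` is hypothetical, and the figure is only an upper bound on the speed-up, assuming lookup time is proportional to the number of index entries scanned.

```python
# Per-letter word counts copied from the table above.
COUNTS = {
    "a": 314, "b": 176, "c": 449, "d": 289, "e": 215, "f": 183, "g": 107,
    "h": 110, "i": 248, "j": 22, "k": 34, "l": 157, "m": 235, "n": 107,
    "o": 135, "p": 334, "q": 20, "r": 283, "s": 463, "t": 252, "u": 84,
    "v": 83, "w": 102, "x": 10, "y": 13, "z": 7,
}

def reduction_factors(counts):
    """For each letter, total words divided by the words in that letter's bucket."""
    total = sum(counts.values())
    return {letter: round(total / n, 1) for letter, n in counts.items()}

factors = reduction_factors(COUNTS)
print(sum(COUNTS.values()))        # 4432 words in total
print(factors["s"], factors["z"])  # 9.6 633.1
```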
On the other hand, this would be a PITA to code, and I understand that.
--
Best regards,
Thanos Massias
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist