[dokuwiki] Re: search improvements

  • From: Chris Smith <chris@xxxxxxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Sun, 27 Aug 2006 02:39:14 +0100

Chris Smith wrote:


It's worse than that.

I have sent through a patch for a fourth algorithm. It calculates the snippet based on UTF-8 character counts rather than the byte counts used by "opt1" and "opt2". My early profiling suggests it is slower than "opt2" but marginally faster than "opt1" and much faster than "orig".
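
The patch itself isn't attached here, but the core idea of switching from byte offsets to character offsets can be sketched roughly as follows. This is only an illustration, not the actual ft_snippet() code; it uses PHP's standard mbstring functions as a stand-in for DokuWiki's own utf8_* helpers, and the function name and context size are made up for the example:

<?php
// Hypothetical sketch of the idea behind the "utf8" snippet algorithm:
// measure context in UTF-8 characters instead of raw bytes. This is NOT
// the actual ft_snippet() patch, just an illustration using mbstring
// functions in place of DokuWiki's own utf8_* helpers.
function snippet_utf8($text, $term, $context = 40) {
    // character offset of the first match (mb_* counts characters, not bytes)
    $pos = mb_strpos($text, $term, 0, 'UTF-8');
    if ($pos === false) {
        // no match: fall back to the start of the page
        return mb_substr($text, 0, 2 * $context, 'UTF-8');
    }
    $start = max(0, $pos - $context);
    $len   = mb_strlen($term, 'UTF-8') + 2 * $context;
    // character-based offsets never cut a multibyte sequence in half,
    // which is the failure mode a plain byte-based substr() risks
    return mb_substr($text, $start, $len, 'UTF-8');
}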


ft_snippet avg times, milliseconds (one search term, run once for each, 14 calls per run)

         #1    #2
opt2:     5.0   4.7
utf8:     7.1   6.3
opt1:     8.2   5.5
orig:   117    83
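
For reference, per-call averages like the ones above can be taken with a trivial microtime() loop along these lines. This is only a sketch of the measurement, not the actual profiling setup; the test pages and search term used are not shown in this thread:

<?php
// Rough sketch of how per-call averages could be measured; $snippet_fn
// names whichever implementation is under test ('orig', 'opt1', 'opt2'
// or the utf8 variant), and $pages/$term stand in for test data that
// is not specified in this thread.
function time_snippet($snippet_fn, array $pages, $term, $calls = 14) {
    $start = microtime(true);
    foreach (array_slice($pages, 0, $calls) as $text) {
        $snippet_fn($text, $term);
    }
    // average time per call, in milliseconds
    return (microtime(true) - $start) / $calls * 1000;
}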

Guy Brand is doing some more detailed tests. If the above results hold up, I think the utf8 algorithm is the way to go. The differences in execution time are negligible over a whole page, and the benefits are pretty good for DokuWiki users with predominantly multibyte UTF-8 text.

Chris

PS.
The change for wikiFN wasn't as successful as I hoped; it only cut out ~20% of the cleanID calls (from 478 to 391) and saved about 100ms on a 1000ms page generation. That still leaves cleanID costing 400ms of the remaining 900ms, so there may be scope for further improvement.
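
The wikiFN change itself isn't included in this mail. One common way to cut repeated cleanID work of this kind is simply to memoise the result per raw id, roughly like the sketch below. This is hypothetical and not the actual patch; only the call to DokuWiki's real cleanID() is taken as given:

<?php
// Hypothetical memoisation wrapper - NOT the actual wikiFN/cleanID change.
// The idea: a single page render resolves many of the same ids repeatedly,
// so caching cleanID()'s output per raw input avoids recomputing it.
function cleanID_cached($raw_id) {
    static $cache = array();
    if (!isset($cache[$raw_id])) {
        $cache[$raw_id] = cleanID($raw_id);   // DokuWiki's real cleanID()
    }
    return $cache[$raw_id];
}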





-- DokuWiki mailing list - more info at http://wiki.splitbrain.org/wiki:mailinglist
