[dokuwiki] Re: Scalability testing: content needed
- From: Christopher Smith <chris@xxxxxxxxxxxxx>
- To: dokuwiki@xxxxxxxxxxxxx
- Date: Fri, 21 Nov 2008 10:57:47 +0900
On 21 Nov 2008, at 05:11, holmberg_jason@xxxxxxx wrote:
Hi list,
I'm evaluating DokuWiki for potential use with a 2000+ topics project.
I'm interested in learning how the Search would perform under those
conditions and would like to test it myself with my available
hardware.
Can anyone recommend a good way to get 2000 topics of content without
too much scripting to convert existing HTML-based content? Is there a
difference between performance time of HTML content and native wiki
content?
I dream of a random wiki topic generator with a zip file download, but I
wake to find no such thing exists... :)
What do you mean by a topic? A page?
Search indexing was revamped at the end of 2006. You should find some
notes in this list's archive about the improvements made at that
time. There may also be some messages on real world experiences of
the changes after the release in March 2007. Whole-word searching is
quite quick; partial searching (e.g. doku* or *wiki) is not so fast.
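To illustrate why the two kinds of search can differ in speed, here is a toy word index in Python. It is only a sketch of the general idea, not DokuWiki's actual indexer; the function names and the sample pages are made up for the example.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def build_index(pages):
    """Map each lowercased word to the set of page names containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search_whole(index, word):
    # A whole-word query is a single dictionary lookup.
    return set(index.get(word.lower(), ()))


def search_wildcard(index, pattern):
    # A wildcard query has to test every indexed word against the pattern.
    hits = set()
    for word, names in index.items():
        if fnmatchcase(word, pattern.lower()):
            hits |= names
    return hits


# Hypothetical sample content, just for illustration.
pages = {"start": "DokuWiki is a wiki", "syntax": "dokuwiki syntax notes"}
index = build_index(pages)
```

In this toy model the whole-word lookup costs the same however many distinct words are indexed, while the wildcard scan grows with the size of the vocabulary.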
From a search perspective, you don't require "wiki syntax", just
content. Any group of 2000 text files should do. DokuWiki indexes
(and therefore searches) the raw wiki text rather than the rendered
output.
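If no suitable pile of text files is at hand, something along these lines could generate one. It is a minimal sketch: the word list, file naming and page length are arbitrary choices for the example, and it assumes the wiki stores pages as lowercase .txt files under its data/pages directory.

```python
import random
from pathlib import Path

# Arbitrary vocabulary for the example; a real test would want a much
# larger and more varied word list (e.g. taken from a dictionary file).
WORDS = ["wiki", "page", "search", "index", "topic", "content",
         "plugin", "syntax", "project", "hardware", "archive", "list"]


def make_pages(target_dir, count=2000, words_per_page=300, seed=0):
    """Write `count` text files of random words into `target_dir`."""
    rng = random.Random(seed)  # fixed seed so test runs are repeatable
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for i in range(count):
        body = " ".join(rng.choice(WORDS) for _ in range(words_per_page))
        text = f"====== Topic {i} ======\n\n{body}\n"
        (target / f"topic{i:04d}.txt").write_text(text, encoding="utf-8")
    return count
```

Calling make_pages("data/pages/test") would then create topic0000.txt to topic1999.txt in a "test" namespace (the path is an assumption about your install). The new files would still need to be picked up by the indexer before search timings mean anything.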
A Google search for site:www.dokuwiki.org suggests there are 3390
pages in the wiki. That's probably inaccurate, as it will include each
plugin tag, but it puts www.dokuwiki.org at around the scale you're
after. It shouldn't be too difficult to construct a spider to grab
the wiki content for each page. Or maybe if you ask nicely Andi will
send you an archive :)
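A spider of that kind might look roughly like the sketch below. It assumes the site serves raw page text via doku.php with do=export_raw, that the entry point is https://www.dokuwiki.org/doku.php, and that you already have a list of page ids from somewhere; none of that is confirmed here, and the function names are invented for the example.

```python
import time
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://www.dokuwiki.org/doku.php"  # assumed entry point


def raw_export_url(page_id, base=BASE):
    """Build the (assumed) raw-export URL for one page id."""
    return f"{base}?{urlencode({'id': page_id, 'do': 'export_raw'})}"


def save_pages(page_ids, out_dir, delay=1.0, fetch=None):
    """Fetch each page's raw text and save it as a .txt file."""
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for page_id in page_ids:
        data = fetch(raw_export_url(page_id))
        # Flatten namespaces (wiki:syntax -> wiki_syntax.txt).
        (out / (page_id.replace(":", "_") + ".txt")).write_bytes(data)
        time.sleep(delay)  # pause between requests to go easy on the server
```

The fetch parameter is only there so the download step can be swapped out for testing. Keep the delay generous if you do point this at a live wiki.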
- Chris
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist