[dokuwiki] Re: Some performance questions

  • From: Andreas Gohr <andi@xxxxxxxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Sun, 18 May 2008 18:45:59 +0200

On Sat, 17 May 2008 10:42:24 -0700 (PDT)
"Balazs Attila-Mihaly \(Cd-MaN\)" <x_at_y_or_z@xxxxxxxxx> wrote:

> The attached patch makes it so that the function runs in O(1) instead
> of O(n) and satisfies these criteria. Please consider it for
> inclusion if I understood correctly the requirements for it (on our
> dokuwiki installation we use sort by date and it seems to work).

I'm sure Chi will have a look at it. It seems to be a reasonable
change.
 
> Now for the more general issue: Dokuwiki touches a lot of file access
> during page rendering. I don't know if this issue is specific to
> pages using the blog (include) plugin, or a more general program, but
> after I applied the attached patch and re-run the xdebug profiler,
> the most "costly" functions (which had the largest number of
> execution * time length of execution values) were file_exists and
> file_get_contents. Is there any way to reduce this? There was a mail
> some time ago where somebody complained that dokuwiki generated too
> much harddisk access and suggested some solutions, however as far as
> I remember there were no followups.

Yann is running DokuWiki in a high-load environment; he and some of his
developers are currently working on performance improvements. The main
problem seems to be readdir calls on directories with a large number
of files. There are also some places where file_exists calls could be
dropped in favor of checking the return value of the file open calls.
His proposals are available at
http://wiki.splitbrain.org/wiki:yann:proposals and you might want to add
your own findings and proposals to that page.
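
To illustrate dropping an existence check in favor of checking the
result of the open call itself, here is a minimal sketch in Python
rather than DokuWiki's PHP. The function names are hypothetical and
none of this is taken from DokuWiki's code; it only shows the general
pattern of going from two filesystem calls to one.

```python
import os


def read_if_exists_checked(path):
    """Two filesystem calls: an existence check, then the open/read."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as fh:
        return fh.read()


def read_or_none(path):
    """One call: attempt the open and treat a missing file as None."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            return fh.read()
    except FileNotFoundError:
        return None
```

Both functions return the same result for existing and missing files;
the second one simply skips the extra stat on every call.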

> - are there some "silver bullets" ( :-) ) to cut down on DokuWiki's
> hard disk access needs?

No, just many small changes that are currently being examined.

> - is the include plugin (and the way it work) the source of all these
> accesses or does it simply magnify the issue because it accesses
> multiple pages?
> - is a cache-friendly version of the include plugin planned? (I
> assume that this would need way to get notified when any of the pages
> from a given namespace changes so that the cache can be purged)

Can't say anything about that. Chi and Chris might be able to answer
this better.

> - i have a lot of files in the attic directory, however recently I've
> lost the changelog, meaning that the current changelog doesn't
> include those old versions. are they safe to delete (assuming that I
> don't need to revert to versions from there)? would deleting them
> improve the performance?

Yes, removing them is fine and will improve your performance. In fact I
recommend cleaning the attic automatically on a regular basis. See
wiki:maintenance
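
A regular attic cleanup could be sketched roughly as follows. This is
not the script from wiki:maintenance; it assumes that old revisions are
ordinary files somewhere under the attic directory and that a file's
modification time is a good enough proxy for the revision's age. The
function name, the dry_run switch and the retention period are all
hypothetical.

```python
import os
import time


def clean_attic(attic_dir, max_age_days, now=None, dry_run=True):
    """Return attic files older than max_age_days; delete them unless dry_run."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # age limit in seconds
    removed = []
    for root, _dirs, files in os.walk(attic_dir):
        for name in files:
            path = os.path.join(root, name)
            # Assumption: mtime reflects when the revision was archived.
            if os.path.getmtime(path) < cutoff:
                if not dry_run:
                    os.remove(path)
                removed.append(path)
    return removed
```

Running it with dry_run=True first only lists what would be deleted.
Once the files are gone, the old revisions can no longer be restored.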

Andi

-- 
splitbrain.org
