[dokuwiki] Some performance questions
- From: "Balazs Attila-Mihaly \(Cd-MaN\)" <x_at_y_or_z@xxxxxxxxx>
- To: Dokuwiki Devel List <dokuwiki@xxxxxxxxxxxxx>
- Date: Sat, 17 May 2008 10:42:24 -0700 (PDT)
Hello all.
Dokuwiki has been great as an internal documentation platform, but recently
I've been having some performance problems with it. Because this relates to
both Dokuwiki in general and the blog plugin in particular, I decided to post
it here, since I know that chi reads this mailing list. Please excuse me if
the content of this mail isn't 100% dokuwiki-core related.
I loaded up XDebug and profiled a request. One problematic function was
_uniqueKey from the blog plugin, which took up around 30% of the request
time. As far as I can tell, the purpose of this function is the following:
- to generate unique keys (to ensure that the returned keys are different and
unique, even if the input parameters are the same)
- the sort order of the generated keys should be alphabetic (or numeric if the
inputs are numbers), and if the inputs coincide, the key generated at the first
call should be "less" than the key generated at the second call
The attached patch makes the function run in O(1) per call instead of O(n)
while satisfying these criteria. Please consider it for inclusion if I have
understood the requirements correctly (on our dokuwiki installation we sort by
date and it seems to work).
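
A minimal standalone sketch of the counter-based approach described above, for
illustration only. The function name unique_key and the $seen array are
hypothetical and not the plugin's API. Unlike the attached patch, this sketch
also zero-pads the duplicate counter; that is an assumption made here so that
string ordering keeps holding past ten identical keys.

```php
<?php
// Hypothetical standalone helper; names are illustrative, not the plugin's.
function unique_key($key, array &$seen)
{
    // Zero-padded hex keeps string order equal to numeric order for
    // non-negative integers that fit into 8 hex digits (e.g. timestamps).
    if (is_numeric($key)) {
        $key = sprintf('%08x', $key);
    }
    // One counter per distinct key: a single array lookup instead of
    // probing the result array until a free slot is found.
    if (!array_key_exists($key, $seen)) {
        $seen[$key] = 0;
    }
    // Assumption: pad the counter too, so the 10th duplicate still sorts
    // after the 2nd when the keys are compared as strings.
    return sprintf('%s_%04d', $key, $seen[$key]++);
}

$seen = array();
$keys = array(
    unique_key(1210000000, $seen), // first entry with this timestamp
    unique_key(1210000000, $seen), // same timestamp again
    unique_key('start', $seen),    // literal key
);
```

The two keys generated from the same timestamp differ, and the first compares
as "less" than the second, which are the two requirements listed above.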
Now for the more general issue: Dokuwiki performs a lot of file accesses during
page rendering. I don't know if this issue is specific to pages using the blog
(include) plugin or a more general problem, but after I applied the attached
patch and re-ran the XDebug profiler, the most "costly" functions (those with
the largest number of calls * time per call) were file_exists and
file_get_contents. Is there any way to reduce this? There was a mail some time
ago where somebody complained that dokuwiki generated too much hard disk access
and suggested some solutions, but as far as I remember there were no
follow-ups.
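
One generic way to cut repeated reads of the same file within a single request
is to memoize them. The sketch below is only an illustration of that idea:
cached_file_get_contents is a hypothetical helper, not a DokuWiki function,
and it assumes that a file does not change while the request is running.

```php
<?php
// Hypothetical helper, not part of DokuWiki: remembers file contents for
// the duration of one request so that repeated reads are served from memory.
function cached_file_get_contents($path)
{
    static $cache = array();
    if (!array_key_exists($path, $cache)) {
        // is_readable() avoids a warning for missing files. The false
        // result is stored as well, so a missing file is probed only once.
        $cache[$path] = is_readable($path) ? file_get_contents($path) : false;
    }
    return $cache[$path];
}
```

This trades memory for disk access and returns stale content if a file is
rewritten mid-request, so it only fits data that is read far more often than
it is written.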
In conclusion my questions would be:
- are there some "silver bullets" ( :-) ) to cut down on DokuWiki's hard disk
access needs?
- is the include plugin (and the way it works) the source of all these accesses,
or does it simply magnify the issue because it accesses multiple pages?
- is a cache-friendly version of the include plugin planned? (I assume that
this would need a way to get notified when any of the pages in a given
namespace changes, so that the cache can be purged)
- I have a lot of files in the attic directory, but I recently lost the
changelog, meaning that the current changelog doesn't include those old
versions. Are they safe to delete (assuming that I don't need to revert to
versions from there)? Would deleting them improve performance?
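
As an illustration of the cache question above, an alternative to being
notified is to check whether a cached rendering is older than any page it
was built from. This is a sketch under assumptions: cache_is_fresh and its
parameters are hypothetical names, and it is not meant to describe how
DokuWiki or the include plugin actually handle caching.

```php
<?php
// Hypothetical check: a cached rendering is usable only if it is at least
// as new as every page file it was built from.
function cache_is_fresh($cacheFile, array $sourceFiles)
{
    if (!file_exists($cacheFile)) {
        return false;
    }
    $cacheTime = filemtime($cacheFile);
    foreach ($sourceFiles as $source) {
        // A deleted page or a page modified after the cache was written
        // invalidates the cache.
        if (!file_exists($source) || filemtime($source) > $cacheTime) {
            return false;
        }
    }
    return true;
}
```

The check replaces file reads with cheaper stat calls, but it does not notice
pages newly added to a namespace unless the list of source files is rebuilt
on every request.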
Best regards and thank you for your patience.
73a74
> $unique_keys_memoize = array();
111c112
< $key = $this->_uniqueKey($key, $result);
---
> $key = $this->_uniqueKey($key, $unique_keys_memoize);
221c222
< function _uniqueKey($key, &$result){
---
> function _uniqueKey($key, &$unique_keys_memoize){
224,237c225,230
< if (is_numeric($key)){
< while (array_key_exists($key, $result)) $key++;
< return $key;
<
< // append a number to literal keys
< } else {
< $num = 0;
< $testkey = $key;
< while (array_key_exists($testkey, $result)){
< $testkey = $key.$num;
< $num++;
< }
< return $testkey;
< }
---
> if (is_numeric($key))
> $key = sprintf('%08x', $key);
> if (!array_key_exists($key, $unique_keys_memoize))
> $unique_keys_memoize[$key] = 0;
>
> return sprintf('%s_%s', $key, $unique_keys_memoize[$key]++);
- Follow-Ups:
- [dokuwiki] Re: Some performance questions
- From: Andreas Gohr