After being kicked off our provider's SAN share for too-heavy usage, then buying our own NFS server and trying to put DokuWiki on it, we understood why we got kicked out :) It seems that DokuWiki does many, many file accesses, which probably go unnoticed on local, fast disks, but which are a complete killer when you try to run DokuWiki over NFS. After some investigation (sigh :D) we found a couple of issues which could be improved.

But let's start with the hardware: DokuWiki is running on a dual quad-core 1.6 GHz Xeon with 8 GB of RAM; the NFS server is a dual 2.8 GHz HyperThreaded box with 4 GB of RAM and 10k SCSI disks in software RAID 10. Right now, if I put DokuWiki's data on the NFS server, it holds for, well, a good 3 minutes before the load reaches 20 :P

So, investigation: we found the following function:

/**
 * Return a list of available and existing page revisons from the attic
 *
 * @author Andreas Gohr <andi@xxxxxxxxxxxxxx>
 * @see    getRevisions()
 */
function getRevisionsFromAttic($id,$sorted=true){
  $revd = dirname(wikiFN($id,'foo'));
  $revs = array();
  $clid = cleanID($id);
  if(strrpos($clid,':')) $clid = substr($clid,strrpos($clid,':')+1); //remove path
  $clid = utf8_encodeFN($clid);

  if (is_dir($revd) && $dh = opendir($revd)) {
    while (($file = readdir($dh)) !== false) {
      if (is_dir($revd.'/'.$file)) continue;
      if (preg_match('/^'.$clid.'\.(\d+)\.txt(\.gz)?$/',$file,$match)){
        $revs[]=$match[1];
      }
    }
    closedir($dh);
  }
  if($sorted) rsort($revs);

  return $revs;
}

If I am not mistaken, this function does a readdir() of the attic directory, runs a preg_match() on every single file to see whether it belongs to $id, and returns the list of revisions for $id.
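To make the cost concrete, here is a minimal sketch of the per-file check that runs once for every entry in attic/; the page name and timestamp are made-up examples, not real data from our wiki:

```php
<?php
// Attic filenames have the form "<page>.<timestamp>.txt" or
// "<page>.<timestamp>.txt.gz". The values below are hypothetical.
$clid = 'start';                   // cleaned page name, namespace stripped
$file = 'start.1199145600.txt.gz'; // one directory entry from attic/

// This is essentially the test the loop above applies to all 46k entries:
if (preg_match('/^'.$clid.'\.(\d+)\.txt(\.gz)?$/', $file, $match)) {
    $rev = $match[1];              // the revision timestamp, "1199145600"
}
```

So the work done is proportional to the total number of attic files, not to the number of revisions of the one page you asked about.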
Two things:

yann@dongo:/srv/www/fr/doc.ubuntu-fr.org/htdocs/data/attic$ ls -l | wc -l
46426

That's on a 3-year-old wiki :) And also:

yann@dongo:/srv/www/fr/doc.ubuntu-fr.org/htdocs$ grep -R getRevisionsFromAttic bin/ conf/ inc/ lib/
inc/changelog.php: * @see getRevisionsFromAttic()
inc/changelog.php:    $revs = array_merge($revs,getRevisionsFromAttic($id,false));
inc/changelog.php:function getRevisionsFromAttic($id,$sorted=true){

This means the function is called only once, in changelog.php, in the function getRevisions($id, $first, $num, $chunk_size=8192), right at the end:

  $revs = array_merge($revs,getRevisionsFromAttic($id,false));
  $revs = array_unique($revs);

So what happens exactly: the changes to a particular page are stored in data/meta/, in a file called file.changes, which is a per-page split of the old changes.log. DokuWiki parses that file to find the latest changes that happened to the page. But DokuWiki *also* scans attic/ for existing files, possibly older revisions, and merges those revisions into the ones found in the changelog.

My suggestion: should we just get rid of these 2 lines? If someone deleted a revision from the changelog it could be intentional, yet the revision would still be displayed because the file is still in attic/ :) As this is also the only call to that function, I'd suggest we either drop it entirely, or keep it somewhere to rebuild the changes files, but call it only from the admin panel...
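If those 2 lines were dropped, the revision list would come from the per-page changes file alone. A hedged sketch of that, assuming the changelog line format where each line starts with a Unix timestamp followed by tab-separated fields (function name and sample lines are hypothetical):

```php
<?php
// Hypothetical helper: extract revision timestamps from per-page
// changelog lines, newest first, with no attic/ scan at all.
function revisionsFromChanges(array $lines) {
    $revs = array();
    foreach ($lines as $line) {
        $fields = explode("\t", rtrim($line, "\n"));
        // The first field is assumed to be the Unix timestamp.
        if (isset($fields[0]) && ctype_digit($fields[0])) {
            $revs[] = $fields[0];
        }
    }
    rsort($revs);
    return $revs;
}

// Two made-up changelog lines for illustration:
$lines = array(
    "1199145600\t127.0.0.1\tE\tstart\tyann\tminor fix\n",
    "1199232000\t127.0.0.1\tE\tstart\tyann\tanother edit\n",
);
$revs = revisionsFromChanges($lines); // array('1199232000', '1199145600')
```

This touches exactly one file per page instead of stat()'ing the whole attic.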
Right now, for me, this function is doing 45000 getattr() calls, and as many regexp checks :)

Second point, the following function:

/**
 * returns an array of full paths to all metafiles of a given ID
 *
 * @author Esther Brunner <esther@xxxxxxxxxxxxx>
 */
function metaFiles($id){
  $name  = noNS($id);
  $dir   = metaFN(getNS($id),'');
  $files = array();

  $dh = @opendir($dir);
  if(!$dh) return $files;
  while(($file = readdir($dh)) !== false){
    if(strpos($file,$name.'.') === 0 && !is_dir($dir.$file))
      $files[] = $dir.$file;
  }
  closedir($dh);

  return $files;
}

If I understand it correctly, it returns an array containing all the metafiles for a given page. For this it reads through data/meta/, checks every entry against the name we are looking for, and adds the matches to the array it then returns.

Comments:

yann@dongo:/srv/www/fr/doc.ubuntu-fr.org/htdocs/data/meta$ ls -l | wc -l
6203

Suggestion: having a quick look at meta/, it seems that for a given ID you get 3 different files: a .meta, a .changes and a .indexed. So why not just check whether those 3 files exist, and return them in an array? Seems simple; maybe I am missing something, so don't be too harsh if that's the case :P

I will continue to look for improvements, but I think that any readdir() working on attic/, pages/ or meta/ should be gotten rid of, as it means linear complexity in the size of the wiki, and therefore no scalability as your wiki grows :(

Thanks!
Yann
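PS: a minimal sketch of the metaFiles() suggestion above. This is hypothetical and untested against plugins, which may well create metafile extensions beyond the three I saw (that would be the thing I am missing):

```php
<?php
// Hypothetical replacement: probe only the three known suffixes with
// file_exists() instead of readdir()'ing all of meta/. Cost: 3 stat()
// calls per page, independent of how large the meta/ directory grows.
function metaFilesDirect($dir, $name) {
    $files = array();
    foreach (array('.meta', '.changes', '.indexed') as $ext) {
        $path = $dir . $name . $ext;
        if (file_exists($path)) {
            $files[] = $path;
        }
    }
    return $files;
}

// Usage with a made-up directory and page name:
$found = metaFilesDirect(sys_get_temp_dir() . '/', 'some-nonexistent-page');
// $found is empty when none of the three metafiles exist.
```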