[dokuwiki] Re: Changelog rewrite


On Jul 28, 2006, at 12:12 PM, Andreas Gohr wrote:

Ben Coburn wrote:
If no one is working on the changelog code yet, I will follow through with the suggestion that I posted back in May.
See http://www.freelists.org/archives/dokuwiki/05-2006/msg00483.html
The main problem is that all the changelog items are stored in one monolithic file. This file gets loaded into memory (and split into lines) by PHP at least once on every page access. As 'changes.log' grows without bound, I suspect it will eventually clobber PHP.

This is not completely true anymore. If I remember correctly, the changelog is no longer read in full but in chunks, backwards from the end. So memory usage shouldn't be the problem.

I'm aware of this. (It got quoted more as a matter of context. However, 'inc/common.php#getRevisionInfo', around line 964, still reads the whole file.)
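
For illustration, reading a log backwards in chunks could look roughly like the sketch below. It is written in Python rather than DokuWiki's PHP, and the function name, the chunk size and the one-entry-per-line layout are assumptions made for the example, not the actual DokuWiki code.

```python
import os


def read_last_lines(path, count, chunk_size=4096):
    """Return the last `count` lines of `path` without loading the whole file."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        position = fh.tell()
        buffer = b""
        # Step backwards one chunk at a time. More than `count` newlines in the
        # buffer guarantees that the last `count` lines are complete.
        while position > 0 and buffer.count(b"\n") <= count:
            step = min(chunk_size, position)
            position -= step
            fh.seek(position)
            buffer = fh.read(step) + buffer
    lines = buffer.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after a trailing newline
    if count <= 0:
        return []
    return [line.decode("utf-8", errors="replace") for line in lines[-count:]]
```

Memory use is bounded by the size of the requested tail plus one chunk, however large the log has grown.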



The approach I would suggest is to keep one changelog file for each wiki page. These files could be kept under the 'data/meta/' directory as '<id>.changes'.
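
A minimal sketch of the per-page layout, in Python for illustration only. The helper names, the mapping of ':' namespace separators to directories and the tab-separated line layout (timestamp, page id, user, summary) are assumptions for the example; only the 'data/meta/' location and the '<id>.changes' name come from the proposal.

```python
import os


def page_changelog_path(meta_dir, page_id):
    """Map a page id such as 'wiki:syntax' to '<meta_dir>/wiki/syntax.changes'."""
    # Assumption: ':' separates namespaces and each namespace becomes a directory.
    return os.path.join(meta_dir, *page_id.split(":")) + ".changes"


def append_change(meta_dir, page_id, timestamp, user, summary):
    """Append one entry to the page's own changelog and return the file path."""
    path = page_changelog_path(meta_dir, page_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    # Keep each entry on one line with intact field separators.
    clean = summary.replace("\t", " ").replace("\n", " ")
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(f"{timestamp}\t{page_id}\t{user}\t{clean}\n")
    return path
```

Appending keeps each per-page file sorted by time, as long as entries are written in the order the edits happen.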

I'm not sure this is really necessary. Most accesses to the changelog need only either the last edit info (which should be stored in the metadata) or the last few lines from the complete log (recent changes). So these accesses aren't a problem, but the latter case would be much more complicated with split changelogs.


To improve the search through the changelog for a specific revision, I suggested implementing a binary search some time ago. Because the changelog is sorted by date, searching with this method should be quite fast.
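
One way such a binary search might work directly on the file, without reading it all, is sketched below in Python. It assumes one entry per line with no blank lines, a leading integer timestamp followed by a tab, and ascending order. The names are hypothetical and this is not the DokuWiki implementation.

```python
import os


def _seek_line_start(fh, pos):
    """Position `fh` at the first line start at or after byte offset `pos`."""
    if pos == 0:
        fh.seek(0)
    else:
        fh.seek(pos - 1)
        fh.readline()  # consume up to and including the next newline


def find_revision(path, target):
    """Return the fields of the entry whose timestamp equals `target`, else None."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        low, high = 0, fh.tell()
        # Find the smallest offset whose following line has timestamp >= target.
        while low < high:
            mid = (low + high) // 2
            _seek_line_start(fh, mid)
            line = fh.readline()
            if not line or int(line.split(b"\t", 1)[0]) >= target:
                high = mid
            else:
                low = mid + 1
        _seek_line_start(fh, low)
        line = fh.readline()
    if not line:
        return None
    fields = line.decode("utf-8", errors="replace").rstrip("\n").split("\t")
    return fields if int(fields[0]) == target else None
```

Each step reads at most two lines, so the number of reads grows with the logarithm of the file size rather than with the number of entries.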


The "recent changes" would be kept fast with a global changelog cache that is trimmed to the last day (week, month...) to keep it small. Showing the recent changes would be as simple as reading in this whole file for display. (All other changelog access would go through the per-page changelogs.) Because the global changelog file replicates data, it can be kept short by trimming it with another "cron" job run from the indexer web-bug.
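
As a rough sketch of the trimming job (Python, for illustration), the function below drops entries older than a cutoff from the global cache file. The function name, the tab-separated layout with a leading integer timestamp and the temporary-file rename are assumptions. Locking against concurrent writers is left out.

```python
import os
import time


def trim_recent_changes(path, max_age_seconds, now=None):
    """Rewrite `path`, keeping only entries newer than the cutoff.

    Returns the number of lines dropped.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    with open(path, "r", encoding="utf-8") as fh:
        lines = fh.readlines()
    # Keep non-blank lines whose leading timestamp is at or after the cutoff.
    kept = [
        line for line in lines
        if line.strip() and int(line.split("\t", 1)[0]) >= cutoff
    ]
    # Write to a temporary file first so readers never see a half-written cache.
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.writelines(kept)
    os.replace(tmp_path, path)
    return len(lines) - len(kept)
```

Because the per-page changelogs would hold the full history, losing or over-trimming this cache only affects the recent changes display.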


What the per-page changelog really helps with is displaying the "revisions" list. Instead of scanning the attic and then scanning the changelog, the revisions list could be generated by just loading the per-page changelog. Checking whether old revisions exist in the attic could even be put off until the user actually clicks on a link to a missing revision.
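
To make the idea concrete, a revisions list built from a per-page changelog alone might look like the Python sketch below. The field order (timestamp, page id, user, summary), the function name and the `limit` parameter are assumptions for the example. The attic is deliberately not consulted.

```python
def list_revisions(changelog_path, limit=20):
    """Return up to `limit` revisions, newest first, from one page's changelog."""
    try:
        with open(changelog_path, "r", encoding="utf-8") as fh:
            lines = fh.read().split("\n")
    except FileNotFoundError:
        return []  # a page without a changelog has no recorded revisions
    revisions = []
    for line in reversed(lines):
        if not line:
            continue
        # Pad short lines so missing trailing fields become empty strings.
        fields = (line.split("\t") + ["", "", ""])[:4]
        revisions.append({
            "rev": int(fields[0]),
            "id": fields[1],
            "user": fields[2],
            "summary": fields[3],
        })
        if len(revisions) >= limit:
            break
    return revisions
```

Whether the old revision file still exists is only checked later, when a specific revision is requested.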

Regards, Ben Coburn


------------------- silicodon.net -------------------

