[dokuwiki] Re: Changelog rewrite
- From: Ben Coburn <btcoburn@xxxxxxxxxxxxx>
- To: dokuwiki@xxxxxxxxxxxxx
- Date: Fri, 28 Jul 2006 12:57:50 -0700
On Jul 28, 2006, at 12:12 PM, Andreas Gohr wrote:
Ben Coburn wrote:
If no one is working on the changelog code yet, I will follow through
with the suggestion that I posted back in May.
See http://www.freelists.org/archives/dokuwiki/05-2006/msg00483.html
The main problem is that all the changelog items are stored in one
monolithic file. This file gets loaded into memory (and split into
lines) by PHP, at least once every page access. As 'changes.log'
grows without bound I suspect it will eventually clobber PHP.
This is not completely true anymore. If I remember correctly, the
changelog is no longer read in full but in chunks, backwards from the
end. So memory usage shouldn't be the problem.
I'm aware of this. (It was quoted mainly for context. However,
'inc/common.php#getRevisionInfo', around line 964, still reads the
whole file.)
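For context, reading "in chunks backwards" can be sketched roughly as below. This is an illustration in Python, not DokuWiki's PHP code; the function name `tail_lines` and the 4096-byte chunk size are assumptions made for the example.

```python
import os

def tail_lines(path, count, chunk_size=4096):
    """Return the last `count` lines of `path` without loading the whole file."""
    if count <= 0:
        return []
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        pos = fh.tell()
        data = b""
        # Read chunks from the end until more than `count` newlines are
        # buffered, so that the oldest line we keep is known to be complete.
        while pos > 0 and data.count(b"\n") <= count:
            step = min(chunk_size, pos)
            pos -= step
            fh.seek(pos)
            data = fh.read(step) + data
    lines = data.splitlines()
    return [line.decode("utf-8") for line in lines[-count:]]
```

Memory use then depends on the number of lines requested, not on the total size of 'changes.log'.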
The approach I would suggest is to keep one changelog file for each
wiki page. These files could be kept under the 'data/meta/'
directory as '<id>.changes'.
I'm not sure this is really necessary. Most accesses to the changelog
need either only the last edit info (which should be stored in
metadata) or the last few lines of the complete log (recent changes).
So these accesses aren't a problem, but the latter case would be much
more complicated with split changelogs.
To speed up the search through the changelog for a specific revision,
I suggested implementing a binary search some time ago. Because the
changelog is sorted by date, searching with this method should be
quite fast.
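A binary search over the log file itself could look roughly like the following. This is a Python sketch under stated assumptions, not existing DokuWiki code: every line is assumed to start with an integer timestamp followed by a tab, the file is assumed to be sorted by ascending timestamp with no blank lines, and the names `find_revision`, `_line_at_or_after` and `_timestamp` are made up for the example.

```python
def _line_at_or_after(fh, offset):
    """Return the first full line that starts at or after byte `offset` (b"" at EOF)."""
    if offset == 0:
        fh.seek(0)
    else:
        fh.seek(offset - 1)
        fh.readline()  # consume the rest of the line containing byte offset-1
    return fh.readline()

def _timestamp(line):
    """Parse the leading timestamp field of a raw changelog line."""
    return int(line.split(b"\t", 1)[0])

def find_revision(path, rev):
    """Return the tab-separated fields of the entry stamped `rev`, or None."""
    with open(path, "rb") as fh:
        fh.seek(0, 2)  # seek to end of file to learn its size
        lo, hi = 0, fh.tell()
        # Find the smallest offset whose following line has timestamp >= rev.
        while lo < hi:
            mid = (lo + hi) // 2
            line = _line_at_or_after(fh, mid)
            if not line or _timestamp(line) >= rev:
                hi = mid
            else:
                lo = mid + 1
        line = _line_at_or_after(fh, lo)
    if line and _timestamp(line) == rev:
        return line.decode("utf-8").rstrip("\r\n").split("\t")
    return None
```

Each step reads at most two lines, so the number of reads grows with the logarithm of the file size in bytes.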
The "recent changes" would be kept fast with a global changelog cache
that is trimmed to the last day (week, month...) to keep it small.
Showing the recent changes would be as simple as reading in this whole
file for display. (All other changelog access would go through the
per-page changelogs.) Because the global changelog file replicates
data, it can be kept short by trimming it with another "cron" job run
from the indexer web-bug.
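To make the two-file idea concrete, here is a rough Python sketch. Everything in it is an assumption made for illustration: the line layout (timestamp, page id, user, summary, tab-separated), the mapping of ':' in page ids to subdirectories, and the names `page_changelog_path`, `add_log_entry` and `trim_recent_cache`. It also does no file locking, which real code would need.

```python
import os
import time

def page_changelog_path(meta_dir, page_id):
    """Map a page id such as 'wiki:syntax' to '<meta_dir>/wiki/syntax.changes'."""
    return os.path.join(meta_dir, *page_id.split(":")) + ".changes"

def add_log_entry(meta_dir, recent_path, page_id, user, summary, now=None):
    """Append one entry to the per-page changelog and to the global recent cache."""
    ts = int(time.time() if now is None else now)
    clean = summary.replace("\t", " ").replace("\n", " ")
    line = "\t".join([str(ts), page_id, user, clean]) + "\n"
    page_path = page_changelog_path(meta_dir, page_id)
    os.makedirs(os.path.dirname(page_path), exist_ok=True)
    for target in (page_path, recent_path):
        with open(target, "a", encoding="utf-8") as fh:
            fh.write(line)
    return ts

def trim_recent_cache(recent_path, max_age, now=None):
    """Drop cache entries older than `max_age` seconds; return how many were dropped."""
    cutoff = int(time.time() if now is None else now) - max_age
    try:
        with open(recent_path, encoding="utf-8") as fh:
            lines = fh.readlines()
    except FileNotFoundError:
        return 0
    kept = [ln for ln in lines if int(ln.split("\t", 1)[0]) >= cutoff]
    tmp_path = recent_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.writelines(kept)
    os.replace(tmp_path, recent_path)  # swap in the trimmed file in one step
    return len(lines) - len(kept)
```

Because the per-page files hold the full history, trimming the global cache never loses data; it only bounds what "recent changes" has to read.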
What the per-page changelog really helps with is displaying the
"revisions" list. Instead of scanning the attic and then scanning the
changelog, revisions could be generated by just loading the per-page
changelog. Checking if old revisions exist in the attic could even be
put off until the user actually clicks on a link to a missing
revision....
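As a rough illustration of that last point, the revisions list could come straight from the per-page file, with the attic check deferred until a link is followed. This is a Python sketch; the leading-timestamp line layout, the '<id>.<rev>.txt.gz' attic naming used here, and the function names are assumptions for the example.

```python
import os

def list_revisions(changes_path):
    """Return revision timestamps from a per-page changelog, newest first."""
    revisions = []
    try:
        with open(changes_path, encoding="utf-8") as fh:
            for line in fh:
                first = line.split("\t", 1)[0].strip()
                if first.isdigit():  # skip blank or malformed lines
                    revisions.append(int(first))
    except FileNotFoundError:
        return []
    return sorted(revisions, reverse=True)

def revision_in_attic(attic_dir, page_id, rev):
    """Check for the stored old revision; meant to be called only on click."""
    name = os.path.join(attic_dir, *page_id.split(":")) + ".%d.txt.gz" % rev
    return os.path.exists(name)
```

Building the list touches one small file; the attic is only consulted for the single revision the user asks for.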
Regards, Ben Coburn
-------------------
silicodon.net
-------------------
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
- Follow-Ups:
- [dokuwiki] Re: Changelog rewrite
- From: Ben Coburn
- References:
- [dokuwiki] New release and DokuWiki efficiency improvements + GeSHi output caching
- From: Chris Smith
- [dokuwiki] Re: New release and DokuWiki efficiency improvements + GeSHi output caching
- From: Andreas Gohr
- [dokuwiki] Changelog rewrite (was: New release and DokuWiki efficiency improvements...)
- From: Ben Coburn
- [dokuwiki] Re: Changelog rewrite
- From: Andreas Gohr