[dokuwiki] Re: Performance and caching

  • From: "Joe Lapp" <joe.lapp@xxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Tue, 06 Sep 2005 20:48:39 -0500 (CDT)

Thanks for the explanation, Andi.  That makes sense.  But I still disagree on 
the issue of server capacity.

I was thrown by the "$renderer->info['cache'] = false;" feature, which seems
to turn caching on or off for the entire page.  I figured that if you're
refreshing at the page level but caching at the syntactic element level, then 
caching would be far from optimal.  I didn't snoop around the code; is there a 
separate $info[] array for each syntactic element?  Is this what you're saying?
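
(For concreteness, here's roughly how I understand that flag being used -- a
from-memory sketch of a syntax plugin's render() hook, not code lifted from
the DokuWiki source:)

    // Sketch: a syntax plugin opting its page out of the xhtml cache.
    function render($mode, &$renderer, $data) {
        if ($mode == 'xhtml') {
            $renderer->info['cache'] = false;  // don't cache this page
            $renderer->doc .= '<p>Generated ' . date('H:i:s') . '</p>';
            return true;
        }
        return false;
    }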

However, I do want to clarify one thing.  Every time you reduce resource usage 
you necessarily improve the **peak load** capacity.  For example, assembling 
the page requires resources (CPU time, heap, disk access) that can be reduced 
by not assembling the page.  In this case, I expect peak capacity to go up
significantly, because you're doing real work when the alternative is to do
almost nothing (read on).

I suspect you're right that the **time** required to load a page, on average, 
would not change much by going to page-level caching, provided that the server 
is not running near capacity.  There would be little change to off-peak 
performance.  The issue is the point at which the server approaches capacity --
where connections start to become unavailable or clients sporadically start
timing out.

So there's the question of what would have to happen in order to do page-level 
caching and whether this would use significantly fewer resources.  I agree that 
a cached dynamic page will not contain real-time results -- it will be stale
by some period of time, usually just minutes.  If you want real-time behavior
equivalent to what you offer now (if I understand you properly), you'd have to
turn caching off for that page.  But if a page is static, or if it can live
with being stale by 2 or 3 minutes, you can make a page retrieval cost little
more than what the web server would need to serve the file directly.
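
To make that concrete, here's the sort of check I have in mind.  Just a
sketch -- the cache path, page id handling, and TTL are all made up:

    // Hypothetical page-level cache: serve the stored xhtml wholesale if
    // it's fresh enough, skipping parsing, rendering, and page assembly.
    $cacheFile = $cacheDir . '/' . md5($pageId) . '.html';
    $maxAge    = 180;  // tolerate pages up to 3 minutes stale

    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $maxAge) {
        readfile($cacheFile);   // barely more work than a static file
        exit;
    }
    // ...otherwise fall through to the normal parse/render path, and
    // write the fresh output to $cacheFile for the next visitor.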

Again, the issue is server resources, not round-trip time.  With page-level 
caching you would only load the PHP source files needed to decide whether a 
page can be served from cache or must be refreshed.  Loading only a few PHP
files by itself significantly reduces server resource usage and improves peak
capacity.  If the user is logged in and the page needs to contain
user-specific info, then the page would not be drawn from cache but instead 
rebuilt via the existing cached-instructions mechanism -- providing some 
caching benefit even there.  (Likewise for other no-cache signals.)
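
(The gate itself might look something like this -- the cookie name and the
servePageCache() helper are hypothetical, standing in for whatever the real
no-cache signals turn out to be:)

    // Hypothetical front-controller gate, run before loading the rest of
    // DokuWiki.  Logged-in users and non-GET requests bypass the cache.
    $pageId  = isset($_GET['id']) ? $_GET['id'] : 'start';
    $noCache = isset($_COOKIE['DokuWikiAuth'])        // logged in?
            || $_SERVER['REQUEST_METHOD'] != 'GET'    // edit, save, ...
            || isset($_GET['do']);                    // search, diff, ...

    if (!$noCache && servePageCache($pageId)) {
        exit;  // served from cache; almost nothing was loaded
    }
    // ...otherwise include the full DokuWiki stack and rebuild the page
    // via the existing instruction cache.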

Unless I'm missing something, page-level caching should greatly improve the 
load capacity of DokuWiki servers whose clients are primarily read-only 
anonymous visitors.  I suspect that this read-mostly profile describes most
installations, and page-level caching would help them tremendously.

I think it's the difference between allowing DokuWiki to be used on
professional, high-volume sites versus remaining largely a cool hobbyist tool.

And I do think DokuWiki is a cool hobbyist tool.  I'm just a perfectionist with
big aspirations.

Best,
~joe

P.S. I think the best way to deal with this is to mock up an 
imperfect-but-close page-level caching scheme, and then to compare peak load 
benchmarks (not page-gen times) before and after.
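
(Something like ApacheBench would do for a first pass -- e.g. "ab -n 5000
-c 100 http://yourwiki/doku.php?id=start", with the URL adjusted to your
install, watching failed requests and requests-per-second rather than mean
latency.)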

----- Start Original Message -----
From: Andreas Gohr <andi@xxxxxxxxxxxxxx>
To: dokuwiki@xxxxxxxxxxxxx
Subject: [dokuwiki] Performance and caching (was: Roadmap for next release)

> On Mon, 05 Sep 2005 17:24:29 -0500 (CDT)
> "Joe Lapp" <joe.lapp@xxxxxxxxx> wrote:
>  
> > Have you seen my gibbery mumbo jumbo in the performance discussion?
> > http://wiki.splitbrain.org/wiki:discussion:performance
> 
> Yes, I read it but haven't had time to answer. I'm not sure page-level
> caching would increase performance very much.
> 
> You've read about the two-stage caching and wondered what it is for. Let
> me explain how the current caching works. Rendering a wiki page consists
> of two parts: parsing (creating instructions) and rendering (creating
> xhtml). Both tasks are time-consuming. But the data the two tasks create
> differ greatly in how long they stay valid.
> 
> Instructions depend only on a single page (their source), and as long as
> the source isn't changed the instructions do not change. This means the
> instruction cache is only expired when its source changes, which happens
> relatively seldom.
> 
> The output (XHTML in our current case) depends on its instructions
> (from step 1) but on other pages as well. For example, a link
> becomes red or green depending on its target. So we need to expire all
> xhtml cache files if a page is added or removed. We also need to add a
> cache timeout for things that need to be updated periodically (eg. RSS
> feed inclusions). You see the xhtml cache isn't very durable.
> 
> You now understand why there are two stages in caching. If a final cache
> is available it is quickest to use it, but this xhtml cache may already
> be out of date. Even then we can still rely on the instruction cache to
> save half of the full rendering time.
> 
> Okay, now back to your proposal of adding another "whole page" cache. It
> wouldn't save any time on rendering the content itself, as this is
> already covered by the mechanisms mentioned above. It would only save
> some time on all the things that happen around the content. But to make
> sure the user never gets a wrong page we would have to do a lot of
> checks and exceptions. We would need to do the checks already done for
> the current xhtml cache, we would need to check the authentication,
> breadcrumbs shouldn't be cached, we would need to provide a way for
> templates to exclude stuff from the cache, and so on...
> 
> I hope I made it somewhat clear why I don't think a global page cache
> would improve much. However, I _do_ think there is lots of room for
> performance improvements. Many things could probably be done more
> efficiently, and I would be happy for any patches that speed up certain
> functions.
> 
> Andi
> 
> -- 
> http://www.splitbrain.org
> -- 
> DokuWiki mailing list - more info at
> http://wiki.splitbrain.org/wiki:mailinglist
> 

----- End Original Message -----
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
