[dokuwiki] Re: "memory size" Error

Still reading this list ;) Some thoughts:

The instructions don't all need to be held in memory at the same
time. The renderer basically only needs to see one instruction at a
time. The handler sometimes collects a set of instructions for a short
time (e.g. until it sees the closing "tag") to re-organise them, e.g.
for lists and tables, but there's no look-back and there are no jumps.
So it should be possible to stream the instructions so that only a
small window is in memory at any given time.
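
To make the idea concrete, here is a rough sketch in Python rather than
PHP, purely as an illustration. The instruction shape (name, args), the
names stream() and rewrite_block(), and the "table_open"/"table_close"
markers are all made up for the example and are not DokuWiki's actual
handler API. It only buffers between an opening and a closing
instruction and passes everything else straight through:

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical instruction shape: (name, args). Not DokuWiki's real format.
Instruction = Tuple[str, tuple]


def rewrite_block(block: List[Instruction]) -> List[Instruction]:
    """Stand-in for a call rewriter (lists, tables, ...).

    Here it only tags the opening instruction with the number of
    instructions buffered between the opening and closing ones.
    """
    name, _ = block[0]
    return [(name, (len(block) - 2,))] + block[1:]


def stream(instructions: Iterable[Instruction],
           open_name: str = "table_open",
           close_name: str = "table_close") -> Iterator[Instruction]:
    """Yield instructions one at a time.

    Only the instructions between an opening and a closing instruction
    are held in memory; nested blocks are not handled in this sketch.
    """
    buffer: List[Instruction] = []
    for instr in instructions:
        if buffer or instr[0] == open_name:
            buffer.append(instr)
            if instr[0] == close_name:
                # The block is complete: rewrite it and release the buffer.
                yield from rewrite_block(buffer)
                buffer = []
        else:
            yield instr
    # An unclosed block is flushed unchanged.
    yield from buffer
```

A renderer would then just loop over stream(...) and never see more than
the current instruction, while the buffer never grows beyond one table
or list.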

I actually looked into ways to solve this problem a long time back, by
storing the instructions in a file in a way that would allow them to
be read out in chunks. A pure PHP solution needs a binary format, and
everything I tried sucked big time on performance.

IMO the solution would be either:

- use PHP's XML reader, which is both fast and light on memory, i.e.
store the instructions as XML. Some related reading:
http://blog.liip.ch/archive/2004/05/10/processing_large_xml_documents_with_php.html
- use sqlite: store the instructions in a table, one sqlite db per
page. Probably the best way to do it.
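
For the sqlite option, a minimal sketch of what "one db per page, read
out in chunks" could look like. Again this is Python rather than PHP,
only to show the shape of it. The table layout, the JSON encoding of the
arguments, and the function names store_instructions() and
read_instructions() are my assumptions, not anything that exists in
DokuWiki:

```python
import json
import sqlite3
from typing import Iterable, Iterator, List, Tuple

# Hypothetical instruction shape: (name, args), with args JSON-serialisable.
Instruction = Tuple[str, List]


def store_instructions(db_path: str,
                       instructions: Iterable[Instruction]) -> None:
    """Write one page's instructions to its own sqlite file, in order."""
    con = sqlite3.connect(db_path)
    with con:  # commits on success, rolls back on error
        con.execute("DROP TABLE IF EXISTS instructions")
        con.execute(
            "CREATE TABLE instructions ("
            " seq INTEGER PRIMARY KEY,"  # insertion order
            " name TEXT NOT NULL,"
            " args TEXT NOT NULL)"       # JSON-encoded arguments
        )
        con.executemany(
            "INSERT INTO instructions (name, args) VALUES (?, ?)",
            ((name, json.dumps(args)) for name, args in instructions),
        )
    con.close()


def read_instructions(db_path: str,
                      chunk_size: int = 100) -> Iterator[Instruction]:
    """Yield instructions in order, fetching chunk_size rows at a time."""
    con = sqlite3.connect(db_path)
    try:
        cur = con.execute("SELECT name, args FROM instructions ORDER BY seq")
        while True:
            rows = cur.fetchmany(chunk_size)
            if not rows:
                break
            for name, args in rows:
                yield name, json.loads(args)
    finally:
        con.close()
```

The renderer side would iterate over read_instructions(path) and only
ever have one chunk of rows in memory. Whether this beats the current
approach on speed would need measuring.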

Of course both basically require PHP 5, but as long as you continue to
support what's being done now, more users would still be happy.


On Thu, Mar 20, 2008 at 1:56 PM, Christopher Smith <chris@xxxxxxxxxxxxx> wrote:
> In addition to my earlier message, I've done a little more
>  investigating.
>
>  - tried an iterator based on array_pop() on the basis that as the
>  input call stack is processed its memory requirements are reduced in
>  step with the increasing memory requirements of the output call stack.
>
>  Virtually no change in memory use.  Significant increase in processing
>  time.
>
>  Tentative conclusion: PHP's shallow copy mechanism is working in DW's
>  favour.  Changes in processing of call rewriters seem to do no more
>  than achieve the same efficiencies already achieved by PHP, at a cost
>  of greater processing time.
>
>  -----
>
>  Although this problem seems to apply to tables, all of the syntax
>  modes which use call rewriters (e.g. lists, quotes, footnotes) use a
>  similar structure to process their call stacks, so they are vulnerable
>  to the same issue.  It's possible the problem doesn't seem to occur
>  with these other modes because pages aren't created with extremely
>  large instances of their syntax, i.e. big tables are more common and
>  larger than big lists or big footnotes.  The instruction set required
>  to produce a table is also more complex than the instruction set
>  required to produce the output of the other modes.  I think that is a
>  marginal issue; the main issue is that when the handler processes a
>  call rewriter's call stack, it doubles the memory requirements of
>  that stack.
>
>  It looks to me that significant improvements in memory usage by the
>  handler will be difficult to achieve without significant changes to
>  the way it operates.
>
>  For now, my best advice is:
>
>  - don't make really large wiki pages, especially ones with really
>  large tables.
>  - if you must have really large tables, try to break them up.
>  Processing several smaller tables requires less total memory than
>  processing a single large table.
>  - increase PHP's memory_limit
>
>  A couple of things to be looked at are:
>  - detailed look at where/how memory is being used in parser/handler/
>  renderer
>  - reducing duplication of instruction strings, say by using an index
>  to a separate table containing the strings
>
>  Andi has opened an item for this in the bug tracker, 
> http://bugs.splitbrain.org/index.php?do=details&task_id=1357
>
>  - Chris
>
>
> --
>  DokuWiki mailing list - more info at
>  http://wiki.splitbrain.org/wiki:mailinglist
>