[dokuwiki] Re: about the Dokuwiki Lexer/Parser/Handler improvements....

  • From: Christopher Smith <chris@xxxxxxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Fri, 10 Apr 2009 11:45:48 +0100


On 9 Apr 2009, at 21:52, Mike Reinstein wrote:


I am interested in achieving 2 goals in the short term:
* improve the lexing/parsing performance

Sounds good :)


* prototype a parser/lexer based on javascript

Towards those goals, here is what I'm considering:
* take the code from Doku_LexerParallelRegex, the Doku_Lexer->addPattern* functions, and all the Doku_Parser related classes and move them into a Doku_Generate_Tokenizer class that would contain the tokenizer logic (doesn't need invocation on each request)
* modify Doku_Lexer to accept the data structure generated by Doku_Generate_Tokenizer as input

I am thinking that a common data interchange format could be used to store this lexer data. JSON seems like a good choice because it's built into PHP and can serialize/deserialize those data structures pretty fast. JSON is also nice in that it would let me import the same lexing data into JavaScript very easily; all that would then be required is to port the remainder of the lexer, including the stack and the traversal. With the static parts of the lexer/parser moved into Doku_Generate_Tokenizer, that's only a few hundred lines instead of a few thousand.
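To give a sense of what consuming that pre-generated data in JavaScript might look like, here is a minimal sketch. The JSON shape (modes mapping to {pattern, handler} pairs) is entirely hypothetical -- the real Doku_Generate_Tokenizer output would define its own structure -- but it shows the idea of building one alternation regex per mode, the way Doku_LexerParallelRegex does, and walking the text with it:

```javascript
// Hypothetical pre-generated lexer data, as it might be exported
// from PHP via json_encode(). Patterns are JSON-escaped regex sources.
const lexerData = JSON.parse(`{
  "modes": {
    "base": [
      {"pattern": "\\\\*\\\\*", "handler": "strong"},
      {"pattern": "//", "handler": "emphasis"}
    ]
  }
}`);

function tokenize(text, mode, data) {
  const tokens = [];
  // One parallel regex per mode: each pattern becomes a capture group,
  // so we can tell which alternative matched.
  const parts = data.modes[mode].map(p => "(" + p.pattern + ")");
  const re = new RegExp(parts.join("|"), "g");
  let last = 0, m;
  while ((m = re.exec(text)) !== null) {
    if (m.index > last) {
      tokens.push({type: "unmatched", text: text.slice(last, m.index)});
    }
    // The index of the defined capture group tells us the handler.
    const idx = m.slice(1).findIndex(g => g !== undefined);
    tokens.push({type: data.modes[mode][idx].handler, text: m[0]});
    last = re.lastIndex;
  }
  if (last < text.length) {
    tokens.push({type: "unmatched", text: text.slice(last)});
  }
  return tokens;
}
```

For example, tokenize("a**b**c", "base", lexerData) yields unmatched/strong/unmatched/strong/unmatched tokens. The stack-based mode switching of the real lexer is omitted here; this only sketches the data-driven part.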

Have you considered using the regex itself rather than the information used to generate it? At least in the first instance that would seem to be the shorter project, as all you would need to handle is regex compatibility and applying the regex to the wiki content.
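To illustrate: much of a compiled PCRE pattern can be dropped straight into JavaScript. The pattern below is made up for illustration, not DokuWiki's actual compiled lexer regex, and constructs like possessive quantifiers or \G anchors would not carry over and would need rewriting during export:

```javascript
// A PCRE-style pattern as it might be exported from PHP.
// Simple alternations like this are valid JS regex source as-is.
const pcrePattern = "\\*\\*|__|//";
const jsRegex = new RegExp(pcrePattern, "g");

const matches = "some **bold** and //italic// text".match(jsRegex);
// matches: ["**", "**", "//", "//"]
```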


Again, these are just ideas, but I'd love to get feedback from you folks on whether you see this as feasible, and to offer the possibility of contributing this back to the DokuWiki community if it's desired. Didn't want to start coding it and then find out that a super awesome refactoring of the lexer/parser/renderer was underway. :)

Thoughts? Comments/suggestions/feedback/concerns highly appreciated!


I'm not aware of anyone working on the Lexer/Parser/Renderer; it's not the easiest piece of code to get to grips with and it probably doesn't get the developer love it deserves. Andi is the person most likely to know for sure.

One thing to give some thought to is syntax highlighting. GeSHi isn't part of the lexer/parser; it's handled entirely in the rendering phase. I guess you would need some ajax to retrieve the highlighted code snippet.
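Something along these lines is what I have in mind. The endpoint and its parameters are purely hypothetical -- DokuWiki has no such highlight endpoint today, so one would have to be added alongside a JS parser:

```javascript
// Fetch server-side highlighted code over AJAX. The URL and parameter
// names below are hypothetical placeholders, not an existing DokuWiki API.
async function fetchHighlighted(code, language) {
  const resp = await fetch("/lib/exe/highlight.php", {
    method: "POST",
    headers: {"Content-Type": "application/x-www-form-urlencoded"},
    body: new URLSearchParams({code, lang: language})
  });
  if (!resp.ok) {
    throw new Error("highlight request failed: " + resp.status);
  }
  return resp.text(); // HTML produced by GeSHi on the server
}
```

The JS parser would emit a placeholder for each code block and swap in the returned HTML when the request completes.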

- Chris
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
