[dokuwiki] Re: event proposal : parser_text_parse

  • From: "Harry Fuecks" <hfuecks@xxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Fri, 21 Jul 2006 11:19:37 +0200

Hi Chris,

Been playing around with the lexer design again recently (in fact a
port to JavaScript) and think there may be a smart way to solve this
general problem fundamentally, by adding a new method to the lexer API
- addReentryPattern - here's a JavaScript unit test that illustrates
the idea (hopefully close enough to the PHP lexer to make sense):

function testReentryPattern() {
   var l = new lexer('start');
   l.addReentryPattern("\n",'start');

   var testTokens = [
       ["start","",lexer.ENTER,0],
       ["start","aaa",lexer.UNMATCHED,0],
       ["start","\n",lexer.EXIT,3],
       ["start","\n",lexer.ENTER,3],
       ["start","bbb",lexer.UNMATCHED,4],
       ["start","\n",lexer.EXIT,7],
       ["start","\n",lexer.ENTER,7],
       ["start","ccc",lexer.UNMATCHED,8],
       ["start","",lexer.EXIT,11]
   ];

   assertTokensEqual(l.parse("aaa\nbbb\nccc"), testTokens, 'Reentry Pattern');
}

What's happening is, on encountering the _single_ re-entry pattern,
the lexer emits _two_ tokens - first an EXIT and then an ENTER - so
it's basically a toggle.
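
To make the toggle concrete, here is a minimal sketch of a toy lexer that
produces the token list from the test above. It is not the PHP lexer or
the JavaScript port: ToyLexer is a hypothetical name, it supports only one
re-entry pattern, and it matches that pattern as a literal string rather
than a regular expression. The token layout [mode, text, type, position]
is taken from the test.

```javascript
// Hypothetical toy lexer for illustration only; not the real lexer API.
var ENTER = 'enter', UNMATCHED = 'unmatched', EXIT = 'exit';

function ToyLexer(startMode) {
    this.startMode = startMode;
    this.reentry = null; // { pattern: literal string, mode: mode to re-enter }
}

// Assumption: the pattern is a literal string, not a regex.
ToyLexer.prototype.addReentryPattern = function (pattern, mode) {
    if (pattern === '') {
        throw new Error('re-entry pattern must not be empty');
    }
    this.reentry = { pattern: pattern, mode: mode };
};

ToyLexer.prototype.parse = function (input) {
    var mode = this.startMode;
    var tokens = [[mode, '', ENTER, 0]];
    var pos = 0;

    while (pos < input.length) {
        var hit = this.reentry ? input.indexOf(this.reentry.pattern, pos) : -1;
        var end = hit === -1 ? input.length : hit;

        // Text before the next re-entry pattern (or up to the end of input).
        if (end > pos) {
            tokens.push([mode, input.slice(pos, end), UNMATCHED, pos]);
        }
        if (hit === -1) {
            break;
        }

        // The toggle: one matched pattern yields an EXIT and then an ENTER.
        tokens.push([mode, this.reentry.pattern, EXIT, hit]);
        mode = this.reentry.mode;
        tokens.push([mode, this.reentry.pattern, ENTER, hit]);
        pos = hit + this.reentry.pattern.length;
    }

    tokens.push([mode, '', EXIT, input.length]);
    return tokens;
};

// Usage, mirroring the unit test above.
var toy = new ToyLexer('start');
toy.addReentryPattern("\n", 'start');
var toyTokens = toy.parse("aaa\nbbb\nccc");
```

Both tokens of a pair carry the same matched text and the same position, so
a handler can treat the EXIT/ENTER pair as a single boundary if it wants to.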

It's fairly easy to add this to the lexer, although for DokuWiki the
"handler" may be a problem, as it's trying to be smart about line
breaks.

Anyway - hope that's useful.

Harry

On 7/21/06, Chris Smith <chris@xxxxxxxxxxxxx> wrote:
Hi,

In order to fix a problem with the line break plugin[1] I need to ensure
a space occurs between text and line endings in the preparsed wiki data.
I can do this by using the new IO_WIKIPAGE_READ event[2] but that means
I am modifying the data when it's being read for other purposes besides
parsing for display.

I think there is probably enough separation between the two uses to
warrant an event which can supply preprocessing of raw wiki data along
these lines[3] as it will also catch any inline uses of the parser.

event: PARSER_TEXT_PARSE  (or perhaps PARSER_TEXT_GETINSTRUCTIONS)
data:  raw wiki text
action: parse the text and generate the instructions list
preventable: ? probably
signalled: p_get_instructions
result: instruction list

Any comments?

Cheers,

Chris

[1] to avoid messing up double new lines the plugin can't grab a new
line when it's the first character being processed - which also occurs
when dokuwiki syntax occurs immediately prior to a line break.

[2] great addition Ben.

[3] the recent <IF***> discussion could also make use of access to the
data immediately before it is parsed to strip out any content that
shouldn't be included.

--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
