[dokuwiki] Re: Lexer and Regular Expressions

  • From: Christopher Smith <chris@xxxxxxxxxxxxx>
  • To: dokuwiki@xxxxxxxxxxxxx
  • Date: Wed, 4 Jan 2012 14:07:44 +0000

On 4 Jan 2012, at 12:53, Chris Tregenza wrote:

> Hi,
> 
> I'm adding a syntax plugin to handle some special markup that comes in two 
> forms -
> 
> <card >Some stuff</card>
> 
> <card  some="stuff"  />
> 
> These need different handling so I'm adding patterns to the Lexer as -
> 
>    function connectTo($mode) {
>        $this->Lexer->addEntryPattern('<card.*?[^\/]>', $mode, 'plugin_6d6v2_card');
>        $this->Lexer->addExitPattern('<\/card>', 'plugin_6d6v2_card');
> 
>        $this->Lexer->addSpecialPattern('<card.*?\/>', $mode, 'plugin_6d6v2_card');
>    }
> 
> But this throws up an error -
> 
> PHP Warning:  preg_match() [<a 
> href='function.preg-match'>function.preg-match</a>]: Unknown modifier ']' in 
> .../inc/parser/lexer.php on line 115
> 
> and the plugin's handle function is not triggered.
> 
> The general PHP advice for the error is to use delimiters around the 
> expression but this is not allowed in the Lexer (according to the wiki).
> 
> What regular expression should I be using so that '<card >' will be flagged 
> as an entry pattern but not '<card />'?
> 

There is no need to escape the forward slash.  In this instance it's actually 
harmful.  The lexer uses forward slashes as pattern delimiters, so it escapes 
any forward slash in your pattern itself.  That probably means your backslash 
ends up escaping the lexer's backslash, which leaves your forward slash 
unescaped.  It then terminates the pattern and the ']' after it is read as a 
pattern modifier - hence the error message.
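
The failure can be reproduced outside DokuWiki.  The following is only a 
minimal sketch: it assumes the lexer prepares a pattern by replacing each '/' 
with '\/' and wrapping the result in '/' delimiters.  wrap_like_lexer() is a 
hypothetical stand-in written for this illustration, not the lexer's real code.

```php
<?php
// Hypothetical stand-in for the lexer's pattern preparation (an assumption,
// not DokuWiki's actual implementation): escape '/' and add '/' delimiters.
function wrap_like_lexer(string $pattern): string
{
    return '/' . str_replace('/', '\/', $pattern) . '/';
}

$escaped = wrap_like_lexer('<card.*?[^\/]>'); // slash already escaped by the plugin
$plain   = wrap_like_lexer('<card.*?[^/]>');  // slash left alone

echo $escaped, "\n"; // /<card.*?[^\\/]>/  (escaped backslash, then a bare '/')
echo $plain, "\n";   // /<card.*?[^\/]>/

// '@' hides the "Unknown modifier ']'" warning; preg_match() returns false
// when the pattern cannot be compiled.
var_dump(@preg_match($escaped, '<card >')); // bool(false)
var_dump(preg_match($plain, '<card >'));    // int(1)
```

In the first result the bare '/' closes the pattern early, so PHP tries to 
read ']>/' as modifiers and stops at the ']'.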

Also, I don't think you want '.' immediately after 'card'.  It will match 
tags like <cardrivers> and <cardamom/>, which I doubt is desired.
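
A quick standalone check of that over-matching, run directly against 
preg_match() rather than through the lexer.  The $spaced pattern is a 
hypothetical variant used only for contrast; it is not taken from any plugin.

```php
<?php
$loose  = '/<card.*?>/';   // '.' directly after "card", as in the original patterns
$spaced = '/<card\s.*?>/'; // hypothetical variant: whitespace required after "card"

$tags = ['<card >', '<cardrivers>', '<cardamom/>'];
$loose_hits = $spaced_hits = [];
foreach ($tags as $tag) {
    if (preg_match($loose, $tag))  { $loose_hits[]  = $tag; }
    if (preg_match($spaced, $tag)) { $spaced_hits[] = $tag; }
}

echo 'loose:  ', implode('  ', $loose_hits), "\n";  // all three tags
echo 'spaced: ', implode('  ', $spaced_hits), "\n"; // only <card >
```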

Perhaps something like

Entry '<card\s*?>(?=.*?</card>)'
Special '<card .*?/>'
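
These two patterns can be tried against sample input before wiring them into 
the plugin.  The sketch below is not how the lexer runs them: it calls 
preg_match() directly and uses '#' as the delimiter (an assumption made for 
this test) so the forward slashes need no escaping.  The sample strings are 
illustrative.

```php
<?php
// '#' delimiters are chosen for this standalone test only.
$entry   = '#<card\s*?>(?=.*?</card>)#';
$special = '#<card .*?/>#';

$samples = [
    '<card >Some stuff</card>',
    '<card  some="stuff"  />',
    '<cardamom/>',
    '<card >never closed',
];

$results = [];
foreach ($samples as $text) {
    $results[$text] = [
        'entry'   => preg_match($entry, $text),
        'special' => preg_match($special, $text),
    ];
    printf("%-26s entry=%d special=%d\n", $text,
        $results[$text]['entry'], $results[$text]['special']);
}
```

Only the first sample should satisfy the entry pattern and only the second 
the special pattern; the lookahead keeps an unclosed <card > from opening the 
mode.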

Looking at the regexes in the box plugin might help - 
http://www.dokuwiki.org/plugin:box

- Chris
-- 
DokuWiki mailing list - more info at
http://www.dokuwiki.org/mailinglist
