[dokuwiki] Re: Acronym bug

On 9/15/05, Chris Smith <chris@xxxxxxxxxxxxx> wrote:
>
> > > - if I change my dev install to use '\b' I get the same results as
> > > splitbrain.  If I use current boundary setting I get same results as
> > > my reference install.
> >
> > It originally used \b as boundary but this made problems with UTF-8
> > words.
>
> In looking into this I notice the perl flags used by the lexer don't include
> "u" for utf-8.  I guess that is deliberate but I thought I'd check to make
> sure :)

Well spotted. Hmmm - that might be something to screen the existing
parser modes for. Without the /u flag (which isn't used at this time),
certain regex behaviour is controlled by your server's locale settings,
e.g. what is regarded as a word character. It may be possible to add
the /u flag (in lexer.php - look for "perlMatchingFlags") but it might
break things, so having the unit tests there would be important. I guess
there will be very few cases where it's a problem though, so it might be
easier to review the existing modes and the regexes they use.
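
To illustrate the word-character point, here is a rough sketch in
Python rather than PHP, so it is only an analogy and not the lexer's
actual behaviour. It assumes a hypothetical acronym "ID" and compares
a byte-wise pattern (where only ASCII counts as a word character,
roughly like a pattern without /u under a plain C locale) with a
Unicode-aware one:

```python
import re

ACRONYM = "ID"  # hypothetical acronym, chosen only for illustration


def matches_bytewise(text: str) -> bool:
    """Search the UTF-8 bytes; only ASCII letters/digits/_ count as word chars."""
    pattern = re.compile(rb"\b" + re.escape(ACRONYM.encode("utf-8")) + rb"\b")
    return pattern.search(text.encode("utf-8")) is not None


def matches_unicode(text: str) -> bool:
    """Search the decoded string; accented letters count as word chars."""
    pattern = re.compile(r"\b" + re.escape(ACRONYM) + r"\b")
    return pattern.search(text) is not None


if __name__ == "__main__":
    for sample in ("your ID please", "IDé"):
        print(repr(sample), matches_bytewise(sample), matches_unicode(sample))
```

In the byte-wise case the bytes of "é" are not word characters, so \b
sees a boundary in the middle of "IDé" and the acronym matches inside
the word. The Unicode-aware pattern does not match there. Whether
PHP's regex functions behave the same way on a given server depends on
the locale and PCRE build, so this would need checking against the
unit tests.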
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
