[dokuwiki] Fwd: performance tweak in Dokuwiki Lexer

I'm forwarding this patch by Mike Reinstein to the list for further discussion.

------snip-----
I was playing around with the DokuWiki source code today and added a
slight performance tweak. It's in the Doku_LexerParallelRegex class
(found in inc/parser/lexer.php).

Find this code (around line 144):

list($pre, $post) = preg_split(
    $this->_patterns[$idx] . $this->_getPerlMatchingFlags(),
    $subject,
    2
);

and replace it with this code:

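// Locate the already-matched text with plain string functions
// instead of running the pattern a second time.
$pos = strpos($subject, $matches[0]);
if ($pos !== false)
{
    $pre = substr($subject, 0, $pos);
    $post = substr($subject, $pos + strlen($matches[0]));
}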

I've profiled it in Zend Studio and it seems to reduce the time spent
in the lexer by about 15% (on my machine, from 0.27 seconds to 0.23
seconds, averaged over 10 tests).
It works because the plain string functions are a bit faster than
regular expressions for simple string manipulation like this.
Note: not i18n safe; it should probably use the mb_ functions or whatever
facility DokuWiki provides in the code to handle multibyte strings.
----- snap ------

The code seems to work, but we don't have any unit tests for that
function. Chris, since you're the author of the original function,
could you have a look at Mike's code and maybe provide a unit test?

BTW, as far as I can see, we don't need any multibyte handling here.
strpos, substr and strlen all count in bytes, so the offsets stay
consistent and the operations should work fine at the byte level.
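
As a starting point for such a unit test, here is a minimal sketch that
compares the two ways of splitting. The helper names split_with_regex
and split_with_strpos are made up for illustration and are not part of
DokuWiki; the patterns are simplified stand-ins, not the ones the lexer
actually builds. It checks a UTF-8 subject, and also a lookahead pattern,
which is one case where searching for the matched text with strpos can
land on an earlier occurrence than the one the regex matched.

```php
<?php
// Hypothetical helpers for illustration only; neither exists in DokuWiki.

// Original approach: let preg_split() cut the subject around the first match.
function split_with_regex($pattern, $subject)
{
    return preg_split($pattern, $subject, 2);
}

// Patched approach: find the matched text again with byte-level string functions.
function split_with_strpos($pattern, $subject)
{
    if (!preg_match($pattern, $subject, $matches)) {
        return null; // no match, nothing to split
    }
    $pos = strpos($subject, $matches[0]);
    if ($pos === false) {
        return null;
    }
    return array(
        substr($subject, 0, $pos),
        substr($subject, $pos + strlen($matches[0])),
    );
}

// UTF-8 subject: both approaches work on bytes and agree.
$subject = "Grüße **bold** text";
var_dump(split_with_regex('/\*\*/', $subject) === split_with_strpos('/\*\*/', $subject));
// bool(true)

// Lookahead pattern: the regex matches the second "a", strpos() finds the first.
var_dump(split_with_regex('/a(?=c)/', 'abac') === split_with_strpos('/a(?=c)/', 'abac'));
// bool(false)
```

Whether the second case can actually occur with the patterns DokuWiki
registers is something a real test would need to establish.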

Andi

-- 
splitbrain.org
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
