[dokuwiki] Re: search improvements
- From: Chris Smith <chris@xxxxxxxxxxxxx>
- To: dokuwiki@xxxxxxxxxxxxx
- Date: Sun, 27 Aug 2006 00:04:52 +0100
Chris Smith wrote:
Maybe ... though I have forgotten my original reasoning - I guess it
may have been flawed, here goes...
The context selection amounts are only 50 bytes if I use substr().
If I use utf8_substr() they would be UTF-8 characters. But then when I
come to plug the offset back into preg_match, I don't know the byte
amount. That would mean using two utf8_substr() calls, one for the "pre"
snippet and one for the "post" snippet, so that I could then run
strlen on the match + post snippet to ascertain the new amount for
offset.
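
A minimal sketch of the two-snippet idea described above, not the actual
DokuWiki code. The names sketch_utf8_chars() and sketch_snippet() are
hypothetical, and the character splitting relies on PCRE's /u modifier
instead of DokuWiki's own utf8_substr(). It assumes the search pattern
also carries /u, so that the byte offsets reported by PREG_OFFSET_CAPTURE
fall on character boundaries.

```php
<?php
// Split a UTF-8 string into an array of characters (assumes valid UTF-8).
function sketch_utf8_chars(string $s): array {
    preg_match_all('/./us', $s, $m);
    return $m[0];
}

// Hypothetical helper: find the next match at or after byte offset $offset.
// Returns the snippet plus the byte offset to resume searching from,
// or null when there is no further match.
function sketch_snippet(string $text, string $pattern, int $offset, int $context = 50): ?array {
    if (!preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE, $offset)) {
        return null;
    }
    [$match, $bytePos] = $m[0];          // $bytePos is a BYTE offset

    // Cut on byte positions around the match, then trim by characters.
    $before = substr($text, 0, $bytePos);
    $after  = substr($text, $bytePos + strlen($match));

    $pre  = $context > 0
        ? implode('', array_slice(sketch_utf8_chars($before), -$context))
        : '';
    $post = implode('', array_slice(sketch_utf8_chars($after), 0, $context));

    return [
        'snippet' => $pre . $match . $post,
        // strlen() on match + post snippet gives the new byte offset.
        'next'    => $bytePos + strlen($match) + strlen($post),
    ];
}
```

Calling sketch_snippet() in a loop and feeding 'next' back in as $offset
walks through the text one snippet at a time. Splitting everything before
the match into characters is wasteful on long pages; it is done here only
to keep the sketch short.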
It's worse than that. Because I only have a byte offset, I need to
convert that into a character offset somehow - in order to work out the
position in the string at which the match occurs. That probably means
using preg_split rather than preg_match and utf8_strlen on the first
portion of the split. All very, very messy.
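
To illustrate the byte-to-character conversion, here is a rough sketch
under a few assumptions. sketch_utf8_strlen() is a hypothetical stand-in
for utf8_strlen() that counts every byte that is not a UTF-8 continuation
byte. The first function follows the preg_split route mentioned above; the
second is an alternative not taken from this message, which keeps the byte
offset from preg_match and counts the characters in front of it.

```php
<?php
// Count characters in valid UTF-8 by counting non-continuation bytes
// (continuation bytes have the form 10xxxxxx, i.e. 0x80-0xBF).
function sketch_utf8_strlen(string $s): int {
    return (int) preg_match_all('/[^\x80-\xBF]/', $s);
}

// preg_split route: the character offset of the first match is the
// character length of the first portion of the split.
function sketch_char_offset_via_split(string $text, string $pattern): ?int {
    $parts = preg_split($pattern, $text, 2);
    if ($parts === false || count($parts) < 2) {
        return null;                     // pattern did not match
    }
    return sketch_utf8_strlen($parts[0]);
}

// Alternative: convert a known byte offset (assumed to sit on a
// character boundary) into a character offset.
function sketch_char_offset(string $text, int $byteOffset): int {
    return sketch_utf8_strlen(substr($text, 0, $byteOffset));
}
```

For a text that starts with three two-byte characters and a space before
the match, the match sits at byte offset 7 but at character offset 4, and
both functions report 4.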
--
DokuWiki mailing list - more info at
http://wiki.splitbrain.org/wiki:mailinglist
- Follow-Ups:
- [dokuwiki] Re: search improvements
- From: Chris Smith
- References:
- [dokuwiki] search improvements
- From: Chris Smith
- [dokuwiki] Re: search improvements
- From: Andreas Gohr
- [dokuwiki] Re: search improvements
- From: Chris Smith
- [dokuwiki] Re: search improvements
- From: Andreas Gohr
- [dokuwiki] Re: search improvements
- From: Chris Smith
- [dokuwiki] Re: search improvements
- From: Andreas Gohr
- [dokuwiki] Re: search improvements
- From: Chris Smith