Re: Efficient string parsing

From: GrrrWaaa <grrrwaaa@xxxxxxxxx>
To: luajit@xxxxxxxxxxxxx
Date: Wed, 24 Oct 2012 10:52:13 -0700

Perhaps porting LPeg to a pure Lua(JIT/FFI) implementation? It's one C file, 
2000 loc but a lot of that is Lua/C API. 

I don't know if it will run faster or slower in LuaJIT, but it would remove the 
extra library dependency.

LPeg is powerful and heavily used, so if a pure LuaJIT port is faster, this 
would be a great general purpose contribution. 

On Oct 23, 2012, at 3:15 PM, Wolfgang Pupp wrote:

> Hello list,
> 
> I'm about to implement a simple parser in LuaJIT, and now considering
> on the best approach.
> The target language is somewhat similar to Lua, but more of a
> data-description-thingy (no loops, no conditionals).
> The input chunks will probably always remain <10k lines (much, much
> lower most of the time)- but there's gonna be a lot of them.
> 
> Design criteria are:
> - No dependencies on dynamically linked libraries/C- code (ffi-usage
> should be optional, at most)
> - Not too complicated
> - As fast as possible
> 
> The main criteria is troublesome (I'd immediately go for LPeg if I could :/ ).
> I've recently tested a markdown-parser implementation in LuaJIT, and
> that made me worried about performance (took about 900ms compared to
> <10ms of the C-binding). That code made extensive use of string.*
> functions and string concatenation, which I'm now trying to avoid like
> the plague.
> 
> My current plan is to traverse the input byte-wise, keep some
> context-information (surrounding expression, its start-index, etc.)
> and to detect keywords/operators/... via string.byte- comparison
> instead of string.* functions.
> Am I on the right track? Might I as well take substrings and compare
> those directly (without fear of triggering allocations all the time)?
> 
> I'd be very thankful for any input- I am not too familiar with parser
> design in general, and all the information I found only suggested to
> not use string functions/concatenation since it prevents JIT
> compilation, but no alternatives...
> 
> --Wolfgang
> 
> PS: pointers to relevant/helpful literature on the topic are always 
> appreciated!
>

Follow-Ups:
- Re: Efficient string parsing
  - From: rostislav . sacek

References:
- Efficient string parsing
  - From: Wolfgang Pupp

Re: Efficient string parsing

Other related posts: