Re: LuaJIT in realtime applications

  • From: Luke Gorrie <lukego@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Mon, 23 Jul 2012 07:25:09 +0200

Thanks for the feedback! I'll let you know how my experience goes.

On 21 July 2012 12:28, Mike Pall <mike-1207@xxxxxxxxxx> wrote:

> Luke Gorrie wrote:
> > I'm interested in using LuaJIT in a realtime application. In
> > particular I'd like to use LuaJIT code instead of iptables-style
> > patterns in a network forwarding engine, and I'd like to be able to
> > establish some upper bounds on processing time for my own peace of
> > mind. For example, to be confident that rules of a certain basic
> > complexity level would never take more than (say) 50us to execute.
> >
> > Is this a realistic notion?
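
For concreteness, the kind of rule I have in mind is a small Lua
predicate over a parsed packet, roughly like the sketch below. The FFI
struct layout, field names and addresses are only placeholders for
illustration, not a real design.

    -- Hypothetical filter rule: drop TCP traffic to port 23 from one
    -- subnet. Struct layout and field names are made-up placeholders.
    local ffi = require("ffi")
    local bit = require("bit")

    ffi.cdef[[
    typedef struct {
      uint8_t  protocol;
      uint32_t src_ip, dst_ip;
      uint16_t src_port, dst_port;
    } packet_info;
    ]]

    local function rule_drop_telnet(p)   -- p: pointer to packet_info
      return p.protocol == 6             -- TCP
         and p.dst_port == 23            -- telnet
         and bit.band(p.src_ip, 0xffffff00) == 0x0a000100  -- 10.0.1.0/24
    end

    -- usage: local drop = rule_drop_telnet(ffi.cast("packet_info *", buf))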
>
> With soft-realtime you need to specify more parameters: the worst
> case latency under 'usual' operating conditions, the tolerable
> worst case latency under adverse conditions, and the acceptable
> probability of that happening. And of course the often forgotten
> maximum acceptable bandwidth consumed by the memory allocator and
> the garbage collector.
>
> But the real question is: do you want to find these parameters for
> a specific implementation (with LuaJIT) or do you have strict
> bounds for these numbers and want to shape the implementation to
> match them?
>
> > I'm guessing that GC is the main issue to be concerned about.
>
> As Thomas already said: avoiding allocations is the simplest
> recipe.
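
A concrete example of what I understand "avoid allocations" to mean on
the fast path: pre-allocate any scratch state once and reuse it per
packet, so the steady-state loop creates no new GC objects. This is
only an untested sketch with made-up names.

    -- Sketch: allocation-free per-packet processing.
    local ffi = require("ffi")

    local scratch = ffi.new("uint8_t[?]", 65536)  -- allocated once, reused
    local npackets = 0                            -- plain Lua number

    local function process(ptr, len)
      -- Work on ptr/len and the pre-allocated scratch buffer directly;
      -- avoid building per-packet tables or intermediate strings here.
      ffi.copy(scratch, ptr, len)
      npackets = npackets + 1
    end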
>
> But the incremental GC in LuaJIT 2.0 (same as Lua 5.1) is not that
> bad. It does have some atomic pauses that may be of concern:
>
> - Stacks are traversed atomically -- don't create huge stacks
>   (deep recursion).
>
> - Each table is traversed atomically -- don't create huge tables
>   (millions of elements). Or consider using FFI structures.
>
> - Tables that hit a write barrier will be remarked atomically --
>   this is usually not an issue, unless they are huge (see above).
>
> - The list of userdata objects is traversed atomically -- don't
>   create too many of them. Or consider using FFI cdata.
>
> - Userdata and FFI cdata finalizers may be invoked on any GC
>   checkpoint -- don't create long-running finalizer functions.
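
Regarding the FFI suggestions above: here is a rough sketch of
replacing a huge Lua table with a single FFI array, so the GC sees one
cdata object instead of millions of table entries. The struct and the
size are placeholders.

    -- Sketch: one big FFI array instead of a multi-million-entry table.
    local ffi = require("ffi")

    ffi.cdef[[
    typedef struct { uint32_t key; uint32_t hits; } flow_entry;
    ]]

    local NENTRIES = 1000000
    local flows = ffi.new("flow_entry[?]", NENTRIES)  -- one cdata object

    -- 0-based indexing; reads and writes never allocate.
    flows[42].hits = flows[42].hits + 1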
>
> IMHO it's pretty easy to avoid these issues in your code. [The
> planned new GC for LuaJIT 2.1 will eliminate most of these pauses
> or try to reduce their impact.]
>
> You can reduce the length of each incremental GC step with the
> "setstepmul" parameter. But note that your throughput will suffer
> if the value is too low. You really need to measure the GC step
> duration within your application, since it depends a lot on the
> mix of objects, cache behavior etc.
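
To measure and tune this, I'm planning something along these lines
inside the packet loop; the parameter values are placeholders that
would need calibrating against the real workload, and a
higher-resolution timer than os.clock() would probably be needed in
practice.

    -- Sketch: tune the incremental GC and time individual GC steps.
    collectgarbage("setpause", 150)     -- start the next cycle sooner
    collectgarbage("setstepmul", 100)   -- lower value = shorter steps

    local function gc_step_timed()
      local t0 = os.clock()
      local finished = collectgarbage("step", 10)  -- true if cycle finished
      return os.clock() - t0, finished
    end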
>
> The builtin allocator is a variant of dlmalloc. I'm sure someone
> else has already figured out the worst case pauses this might
> incur.
>
> The JIT compiler is mostly incremental, too. The recording phase
> (which invokes most optimizations on-the-fly) is fully incremental.
> There are some non-incremental phases, like the LOOP, SPLIT and
> SINK optimization passes (these are pretty fast) and the backend
> assembler. They are linearly bounded by the maximum trace size
> (-Omaxtrace=x). We're talking about a couple of microseconds, so
> this shouldn't be too much of a concern.
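
For reference, these limits can also be set from Lua code at startup
rather than on the command line. The values below are arbitrary
examples, not recommendations; as I read the -O parameter docs,
maxtrace caps the number of traces in the cache and maxrecord caps the
number of IR instructions recorded per trace.

    -- Sketch: setting JIT compiler limits programmatically.
    require("jit.opt").start("maxtrace=1000", "maxrecord=4000")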
>
> I'm leaving out the discussion of OS/cache latencies here, since
> you need to take care of these, anyway.
>
> --Mike
