Re: Efficient query timeouts in LuaJIT

  • From: Konstantin Osipov <kostja@xxxxxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Tue, 30 Apr 2013 18:42:54 +0400

* Mike Pall <mike-1304@xxxxxxxxxx> [13/04/30 17:13]:

> > We're desperately looking for ways to implement efficient lua_call()
> > and lua_pcall() timeouts. Yes, they are easily implementable with
> > a debug hook. However, running a hook on every instruction (or few
> > instructions) is not best for performance. Ideally, hooks which
> > check for timeouts need not have granularity lower than tens of
> > microseconds.
>
> You can dynamically set the hook from a signal or a different
> thread (lua_sethook is the only thread-safe call). To be effective
> with LuaJIT, you'd need -DLUAJIT_ENABLE_CHECKHOOK, too. But check
> the caveats at the end of lj_record.c.

In other words, hook check emission is disabled at compile time by
default, since hook checks consume CPU cycles even when no hooks
are set.

Are there performance tests comparing LuaJIT compiled with and
without -DLUAJIT_ENABLE_CHECKHOOK? If not, we may need to create a
benchmark for this.

> > Ideally, it would be great to integrate LuaJIT into our
> > virtual machine, so that it's possible to do a reduction (context
> > switch) while preserving the context of a running call.
> 
> I'm not sure hooks are the best way to go then. Because they may
> interrupt execution *anywhere*, possibly in an undefined or
> inconvenient state (e.g. a lock is held). Often you can only abort
> then. And you have to be careful that the hook doesn't hit when
> you're in unrelated code. Prepare for lots of strange bug reports,
> caused by the non-deterministic nature of this approach.

Is it safe to call Lua error() (or lua_error()) from the hook?
Is there any other way of interrupting execution without
breaking the lua_State?

> Much more reliable semantics (and better performance) will be
> achieved by polling a (shared) memory location with the FFI. So
> add this at defined places in your code (check_ptr is an int *):
> 
>   if check_ptr[0] ~= 0 then ... end
> 
> Then you can safely do a context switch, e.g. a yield.

Unfortunately, we don't control all the code that runs. Query
execution timeout is a feature requested by database
administrators, whereas Lua stored procedures are usually written
by database users, and users make mistakes.

For any server binding or FFI call we already provide a
synchronous cancellation API. Now we need to solve the same
problem for cases when user code has run away completely.

Thank you,

-- 
kostja
http://tarantool.org - an efficient, extensible in-memory data store
