RE: Access C env per LUA instance

  • From: Dibyendu Majumdar <dibyendu.majumdar@xxxxxxxx>
  • To: "luajit@xxxxxxxxxxxxx" <luajit@xxxxxxxxxxxxx>
  • Date: Mon, 4 Nov 2013 00:21:03 +0000

Hi,
 
It seems that what I am doing, i.e. invoking small Lua functions from C many
times, is not recommended. Unfortunately my requirements are such that I cannot
avoid this. In the product I am working on we want to allow users to specify
functions in Lua, so the callback mechanism is the only option. This is
somewhat similar to the numerical integration example that is advised against
on the FFI semantics web page.
 
In a Release build, I do not see much difference in performance between
standard Lua and LuaJIT. I assume the reason is that the callback overhead
negates any performance benefit.
 
The implementation is further complicated because the Lua function called from
C must itself call C functions to do some things. I tried using the FFI instead
of the standard Lua mechanism to call these C functions but, for some reason I
don't understand, this performs worse than the standard mechanism. On the other
hand, returning the Lua closure as a function pointer to the C code results in
better performance than the mechanism I was using before (described in my post
below).
 
The FFI mechanism is astonishing of course. I am able to receive a C structure 
and assign a Lua closure to a function pointer in the structure - Wow!
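
For illustration, a minimal sketch of what the C side of such a setup might look like. The struct, its field, and the function names are hypothetical and not taken from the actual product; the `square` function merely stands in for a Lua closure that has been assigned to the function pointer through the FFI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table; all names here are illustrative only. */
typedef struct evaluator {
    double (*eval)(double x);   /* slot that a script could fill via the FFI */
} evaluator;

/* Stand-in for a user-supplied function. In the setup described above,
   the slot would instead hold a Lua closure converted to a C pointer. */
static double square(double x) { return x * x; }

/* Midpoint-rule integration: invokes ev->eval once per step, so the
   per-call overhead of the callback is paid `steps` times. */
static double integrate(const evaluator *ev, double a, double b, int steps)
{
    assert(ev != NULL && ev->eval != NULL && steps > 0);
    double h = (b - a) / steps;
    double sum = 0.0;
    for (int i = 0; i < steps; i++)
        sum += ev->eval(a + (i + 0.5) * h);
    return sum * h;
}
```

Filling in `evaluator ev = { square };` and calling `integrate(&ev, 0.0, 1.0, 1000)` approximates the integral of x*x over [0, 1]. The loop shows why callback cost dominates when the callee does very little work per call.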
From: dibyendu.majumdar@xxxxxxxx
To: luajit@xxxxxxxxxxxxx
Subject: RE: Access C env per LUA instance
Date: Tue, 29 Oct 2013 23:59:49 +0000




Mike wrote:
> If you need top performance and you're calling lots of C functions,
> then you need to switch from the classic Lua/C API to the FFI.
> But this implies you cannot get at the lua_State pointer from a C
> function called by the FFI, anyway. So I wouldn't bother with any
> temporary workarounds that mess with the lua_State or such.

Ok understood.

> For the FFI approach there are two solutions:
> 
> a) Do it on the C side: use one of the faster thread-local storage
> (TLS) APIs, i.e. one that doesn't call a function for every TLS
> access. The optimum is a mov reg, fs:[fixed_tls_offset]. How to
> achieve that is different for every OS and C compiler.
>
 
I have been looking at using thread-local storage. However, I don't know enough
about how TLS is implemented to work out which API is best.
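
As a starting point, a minimal sketch of option (a) using the C11 `_Thread_local` storage class. The `env` struct and the accessor names are hypothetical. Whether an access compiles down to a single segment-relative load depends on the platform, the compiler, and the TLS model in use; for example, code built into a shared library may go through a function call instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread environment; the field is a placeholder. */
typedef struct env {
    int id;
} env;

/* C11 thread-local storage. GCC and Clang also accept __thread, and
   MSVC uses __declspec(thread). */
static _Thread_local env *current_env = NULL;

/* Called once per thread, before any Lua code runs on that thread. */
static void env_set(env *e) { current_env = e; }

/* A C function called through the FFI can fetch its environment here
   without being handed a lua_State pointer. */
static env *env_get(void) { return current_env; }

/* Example of a function that relies on the thread-local environment. */
static int env_id(void)
{
    env *e = env_get();
    assert(e != NULL);
    return e->id;
}
```

This assumes one Lua instance per thread; if several instances share a thread, the pointer would have to be switched whenever control moves between them.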

> b) Classic OO approach: pass a context argument to every function.
> If the context is (say) a pointer or an integer in an immutable
> upvalue then the JIT compiler will inline that as a constant for
> each C call.
> 

Thanks - this seems easiest ... I will try this.  
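
A minimal sketch of option (b) on the C side, assuming a hypothetical `context` struct and function name. On the Lua side the context pointer would be held in a local variable captured by the closure, so that every FFI call passes the same upvalue as its first argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-instance state; the fields are illustrative only. */
typedef struct context {
    double scale;   /* some precomputed value owned by this instance */
    long   calls;   /* how many times the function has been invoked  */
} context;

/* Every exported function takes the context explicitly, so nothing is
   looked up through a lua_State pointer or global state. */
static double ctx_scale(context *ctx, double x)
{
    assert(ctx != NULL);
    ctx->calls++;
    return ctx->scale * x;
}
```

Because the state travels with the call, the same function can serve any number of Lua instances without locking or thread-local lookups.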
 
I have another question.
 
As I need to call the Lua functions many times from the C code, I am doing the
following:
I call a Lua function that pre-computes values where possible and creates a
closure containing only the bare minimum of steps.
Then I store the closure in the registry by calling:
luaL_ref(L, LUA_REGISTRYINDEX);
 
Subsequently I execute the closure by calling it from C code - this is done 
many times ... and the operations performed by the closure are numeric. 
 
I am hoping that the closure will be JIT-compiled the first time it is called
and will then execute at native speed afterwards ... is that a correct
assumption?
 
Regards 

                                                                                
  
