Re: To which extent LuaJIT is specific to Lua

  • From: Leo Romanoff <romixlev@xxxxxxxxx>
  • To: "luajit@xxxxxxxxxxxxx" <luajit@xxxxxxxxxxxxx>
  • Date: Sat, 30 Nov 2013 00:20:35 -0800 (PST)

Mike, 

Thanks a lot for all your comments, explanations and clarifications on this 
subject!

I'll have a closer look at LuaJIT's implementation and eventually come back 
with more specific questions, if you don't mind.

Thanks again,
 -Leo




> Mike Pall <mike-1311@xxxxxxxxxx> wrote at 22:44 on Friday, 29 November 2013:
> > Leo Romanoff wrote:
>>  OK. I'm wondering how much your tracing infrastructure implementation
>>  is dependent on Lua semantics and its execution model? I.e. could the
>>  tracing implementation mostly be reused (e.g. taking traces, detecting
>>  what and where should be traced, storing traces, purging traces, etc)?
>>  Obviously certain parts of this are almost always
>>  language/runtime/execution model dependent, but I think that a lot of
>>  things are also pretty generic, aren't they?
> 
> You may have to adapt it to the predominant control structures of
> the source language and re-tune the heuristics. But, yes, most of
> it is quite generic.
> 
>>  > OTOH the more recent work, e.g. for the FFI, builds upon lower-level
>>  > parts of the IR. It would be entirely possible to generate IR for
>>  > a C-like language right now, but it would be harder to deal with
>>  > the differences in the execution model.
>> 
>>  Could you elaborate a bit more on these differences and related
>>  difficulties?
> 
> Simply speaking, the JIT compiler doesn't produce regular C
> function prologues, since it doesn't have a need for that. Or it
> limits the acceptable amount of code per function or number of
> simultaneously live variables per function. These limits are fine
> for Lua, but they'd need to be lifted for heavily macro-infested
> C code or notorious manual loop unrolling. ;-)
> 
>>  LOC is not a very good measure of complexity, but how big is LuaJIT
>>  now?
> 
> Depends on how you'd want that to be counted against which
> sub-systems. Well ... you know where to find the source. ;-)
> 
>>  And what would be your estimate (either in number of LOCs or percentage
>>  of code to be changed/added to the core LuaJIT) for retargeting it for a
>>  new language/execution model, e.g. for some examples mentioned in my
>>  original message?
> 
> Sorry, but due to the many factors in such a calculation the error
> margin is too high for me to dare give even a conservative
> estimate.
> 
>>  OK. Does it mean that it is possible to implement e.g. the usual stack
>>  frames à la C/Pascal for keeping the local variables there?
> 
> That's not how LuaJIT (or any modern C compiler) works. C stack
> slots are allocated when the register allocator needs to spill
> values, not variables. There's no direct correspondence between
> variables and stack slots.
> 
>>  Is it possible to implement an "address of" operator and pass a
>>  variable's address to other functions?
> 
> Sure, if the semantics of the source language allow that. Such a
> feature is well known to limit optimization opportunities for the
> compiler. But this is in no way specific to a trace compiler.
> Actually, alias analysis is much easier on isolated traces, so
> this should work out fine.
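
To illustrate the point about "address of" limiting optimization, here is a
minimal C sketch. The function name add_twice is hypothetical and not taken
from LuaJIT. Because the two pointers may refer to the same object, a compiler
cannot keep *b in a register across the first store unless alias analysis
proves that a and b are distinct.

```c
/* Hypothetical helper for illustration only: adds *b to *a twice
 * and returns the new value of *a. */
int add_twice(int *a, const int *b)
{
    *a += *b;   /* this store may also change *b if a and b alias */
    *a += *b;   /* so *b has to be read again here */
    return *a;
}
```

With x = 1 and y = 2, add_twice(&x, &y) yields 5. With z = 1,
add_twice(&z, &z) yields 4, because the first store changes the value that the
second statement reads.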
> 
>>  Is it possible to implement most of C's low-level tricks (bit operations,
>>  unions/structs, type casts, etc)? So, basically it should be possible to
>>  build a full tracing JIT for C using LuaJIT?
> 
> The FFI and the bit library allow most of these ops. The ones
> missing are still present in the IR or could easily be added.
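
For reference, the kind of C-level operations the question refers to can be
sketched as follows. This is plain C, not LuaJIT code. All names (rotl32,
f32_bits, float_bits, low_byte) are hypothetical, and the sketch assumes that
float is a 32-bit IEEE 754 value.

```c
#include <stdint.h>

/* Bit operation: rotate a 32-bit word left by n bits. */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;    /* keep the shift count in range */
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Union: view the same 4 bytes either as a float or as an integer. */
union f32_bits {
    float    f;
    uint32_t u;
};

/* Reinterpret the bits of a float through the union (valid in C). */
uint32_t float_bits(float f)
{
    union f32_bits v;
    v.f = f;
    return v.u;
}

/* Type cast: truncate a 32-bit word to its low 8 bits. */
uint8_t low_byte(uint32_t x)
{
    return (uint8_t)x;
}
```

Under the stated assumption, float_bits(1.0f) is 0x3F800000, and
rotl32(0x80000001, 1) is 3.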
> 
>>  >>  - something like JVM-based languages, e.g. Java, Scala? You said
>>  >>  yourself that LuaJIT beats JVM in many cases.
>>  >
>>  > The core challenge is proper implementation of the execution model
>>  > wrt. concurrency or the GC. And some specific optimizations for
>>  > allocations, since Java programs tend to allocate temporaries like
>>  > there's no tomorrow (collateral damage from the language and
>>  > library design).
>> 
>>  Interesting. Do you mean that a not-so-efficient implementation is
>>  possible with a moderate effort, but an efficient one with a good GC
>>  would really be a challenge?
> 
> The runtime environment is a big part of the Java language
> specification. The shared-everything model does hurt. Read about
> the Java memory model and how they had to refine it several times
> to get somewhat sane semantics a compiler could follow.
> 
>>  But my question is: Are you saying that LuaJIT could be tweaked with a
>>  reasonable effort to replace PyPy by providing a full tracing JIT for
>>  Python?
> 
> Possibly. Not that I'd be interested in working on that, though.
> 
>>  I see. But I'm wondering if there is anything in JVM or e.g. Python or
>>  Ruby which cannot be easily mapped to or expressed with LuaJIT? I.e. do
>>  you know any examples where certain core data types, data structures or
>>  maybe certain language constructs/features/low-level details cannot in
>>  principle be expressed using LuaJIT and can only be modeled rather
>>  inefficiently, using some workarounds?
> 
> There are plenty that cannot be easily or efficiently modeled in
> the source language of LuaJIT plus its extensions. OTOH modeling
> them on top of the IR (with some changes) is certainly feasible.
> 
>>  - How modular is LuaJIT when it comes to retargeting it for a different
>>  language or runtime? How configurable is it? For example, are
>>  Lua-specific optimizations and other implementation details provided in
>>  a few well-defined places or are they spread all over the place?
> 
> The code base is simply not designed for that.
> 
>>  - Are most of the things which heavily depend on the language semantics
>>  implemented in mostly orthogonal ways or are they very deeply
>>  inter-dependent? How easy/difficult is it to replace implementation of
>>  one of such features with an alternative implementation? E.g. if one
>>  would like to add a new core data type (e.g. C-like array or complex
>>  numbers,
> 
> LuaJIT already has C-like arrays and complex numbers. :-)
> 
>>  or a different implementation of tables/dicts, which is not compliant
>>  with Lua's tables)?
> 
> That would be harder. You could decompose it into low-level ops or
> add new mid-level IR instructions plus add the support for all
> backends. MIR offers strictly better optimization opportunities
> due to a lower semantic loss (which is why I'm using that for e.g.
> Lua hash table operations).
> 
>>  - I'm wondering if you ever considered making LuaJIT more generic and
>>  more modular (in this sense, a bit like LLVM), so that it can serve as a
>>  basis for tracing JITs for different languages and runtimes? I understand
>>  that it was not the initial goal, but still... I'm not even suggesting
>>  that you should do it yourself, if there is no interest from your side.
>>  But would you be in favor of such a LuaJIT development direction? Could
>>  you roughly estimate the effort (e.g. amount of redesign and/or
>>  refactoring) required to make LuaJIT an easier and more approachable
>>  target for other languages? Would you consider developing future LuaJIT
>>  changes taking these multi-language support considerations into account?
> 
> I'd never have finished LuaJIT if I had not opted to write a very
> Lua-specific compiler. Right now I don't have the time to
> participate in developing a generic compiler framework. Good luck!
> 
> 
> --Mike
>
