In reference to your last question: ever heard of PyPy/RPython?

Leo Romanoff <romixlev@xxxxxxxxx> wrote:
> Mike, first of all, thanks a lot for your very insightful answers!
>
> Mike Pall <mike-1311@xxxxxxxxxx> wrote at 19:15 on Friday,
> 29 November 2013:
>
>> Leo Romanoff wrote:
>>> Is it possible to use (a correspondingly extended) LuaJIT as a
>>> generic tracing JIT for other languages with different semantics
>>> and execution models? Or is LuaJIT so tightly coupled to Lua, its
>>> semantics and execution model that it is almost impossible to
>>> reuse it for something significantly different from Lua without
>>> rewriting it almost completely?
>>
>> You can certainly adapt the overall design and take many parts of
>> it. Depends a bit on how close the language is to Lua.
>
> Yes. I suspected this ;-)
>
>>> I understand that LuaJIT was initially created specifically for
>>> Lua. But I'm wondering to what extent LuaJIT's inner organization,
>>> design and implementation are tied to the semantics of Lua. I can
>>> imagine that some parts of LuaJIT (e.g. machine code generation,
>>> register allocation, some generic optimizations) are not that much
>>> dependent on Lua and its semantics, while some other parts (e.g.
>>> some of the optimizations) are rather tightly coupled to Lua,
>>> because they are only possible if Lua semantics are assumed (e.g.
>>> table indexing starts with 1, strings are interned, etc).
>>
>> Two key design decisions that make it difficult to reuse all of the
>> code are the medium-level IR and the stack snapshots. These reflect
>> both the semantics of Lua (e.g. hash tables) and its execution
>> model.
>
> OK. I'm wondering how much your tracing infrastructure implementation
> is dependent on Lua semantics and its execution model? I.e. could the
> tracing implementation mostly be reused (e.g. taking traces, detecting
> what and where should be traced, storing traces, purging traces, etc)?
> Obviously certain parts of this are almost always
> language/runtime/execution model dependent, but I think that a lot of
> things are also pretty generic, aren't they?
>
>> OTOH the more recent work, e.g. for the FFI, builds upon lower-level
>> parts of the IR. It would be entirely possible to generate IR for
>> a C-like language right now, but it would be harder to deal with
>> the differences in the execution model.
>
> Could you elaborate a bit more on these differences and related
> difficulties?
>
>> Interestingly, this is the
>> opposite of what you'd be facing when trying to retarget (say)
>> LLVM to a dynamic language, because it makes various assumptions
>> about the execution model that are geared towards compiling C/C++.
>
> Indeed. I see your point.
> BTW, speaking of LLVM, I'd like to mention that it is pretty easy to
> experiment with it, extend it, etc.
> It is very modular and most of the important concerns are clearly
> separated and abstracted. I'm not saying that it produces better
> results than LuaJIT. I'm just saying that it was developed with
> extensibility, modularity and configurability in mind, which is a
> good thing when it comes to developing new features on top of it.
>
> But see my related questions at the end of this mail.
>
>>> So, I'm wondering which parts of LuaJIT are generic and which ones
>>> are tightly based on Lua semantics? The reason for this question:
>>> I'd like to understand better if LuaJIT could be used as a tracing
>>> JIT backend for something very different from Lua in its
>>> semantics. I understand that it is most likely not possible out of
>>> the box and would require adaptations and extensions of LuaJIT.
>>> But the question is: how much work would it be? Would one need to
>>> rewrite almost all of LuaJIT, or maybe there are only a few places
>>> in the code that are very much dependent on the language semantics
>>> and therefore need to be adjusted/changed to meet the semantics of
>>> a different language?
>>> And I'm really interested in using LuaJIT directly, without
>>> mapping the original source language to Lua first, even though it
>>> could be possible in some cases.
>>
>> I'm not sure I could quantify this.
>
> While LOCs are not a very good measure of complexity, how big is
> LuaJIT now? And what would be your estimate (either in number of LOCs
> or percentage of code to be changed/added to the core LuaJIT) for
> retargeting it for a new language/execution model, e.g. for some
> examples mentioned in my original message?
>
> I totally understand that counting LOCs in such a complex piece of
> software as a compiler or JIT is not a very good approximation, as
> every line has more complexity than 100s of lines in an average
> application. I know it very well, because I've developed optimizing C
> compilers in my former life. But due to this experience I can roughly
> translate "compiler LOCs" into the required effort. This is why I ask
> about it. Actually, my former experience is also the reason why I
> started this thread. As an expert in the compiler development area, I
> have a very deep respect for your work on LuaJIT. IMHO, it is really a
> pity that this wonderful technology is currently used only for Lua. I
> think with some adaptations/changes it may be applied in a much
> broader area with great success.
>
>> Maybe ask Thomas Schilling. He
>> wrote a trace compiler for Haskell whose code is based on the core
>> parts of the LuaJIT interpreter and trace compiler. Heavily
>> modified, of course:
>>
>> http://cp.reddit.com/r/haskell/comments/1r4s7b/what_happened_to_the_tracing_jit_work_by_thomas/
>
> Interesting. I've never heard about this attempt. I'll have a look at
> it.
>
>>> Some examples of the languages and features I have in mind are:
>>> - some sort of statically typed language like C/Pascal/etc
>>
>> Entirely possible.
>
> OK. Does it mean that it is possible to implement e.g. the usual stack
> frames à la C/Pascal for keeping the local variables there?
> Is it possible to implement an "address of" operator and pass a
> variable's address to other functions? Is it possible to implement
> most of C's low-level tricks (bit operations, unions/structs, type
> casts, etc)? So, basically it should be possible to build a full
> tracing JIT for C using LuaJIT?
>
>>> - something like JVM-based languages, e.g. Java, Scala? You said
>>> yourself that LuaJIT beats JVM in many cases.
>>
>> The core challenge is proper implementation of the execution model
>> wrt. concurrency or the GC. And some specific optimizations for
>> allocations, since Java programs tend to allocate temporaries like
>> there's no tomorrow (collateral damage from the language and
>> library design).
>
> Interesting. Do you mean that a not-so-efficient implementation is
> possible with a moderate effort, but an efficient one with a good GC
> would really be a challenge?
>
>>> - some scripting languages, which have a different internal
>>> object/execution model, e.g. Python or Ruby
>>
>> Sure, Python has PyPy.
>
> Sure. I understand that there is a JIT for Python, so it is possible to
> write a (tracing) JIT for it.
> But my question is: Are you saying that LuaJIT could be tweaked with a
> reasonable effort to replace PyPy by providing a full tracing JIT for
> Python?
>
>> The difficulties you'll be facing there
>> have more to do with legacy C interfaces, the abundance of
>> specialized core data types or wasteful execution semantics. One
>> pays for abstractions, one way or another.
>
> I see. But I'm wondering if there is anything in JVM or e.g. Python or
> Ruby which cannot be easily mapped to or expressed with LuaJIT? I.e. do
> you know any examples where certain core data types, data structures or
> maybe certain language constructs/features/low-level details cannot in
> principle be expressed using LuaJIT and would have to be modeled rather
> inefficiently, using some workarounds?
>
>>> - features like: custom object layouts in memory (e.g.
>>> C-like structs vs Lua's tables), custom garbage collectors,
>>> support for custom ABIs.
>>
>> Code reuse should be easy if you base it on the same FFI design.
>>
>> Custom garbage collectors are troublesome for any VM or compiler.
>> Sounds nice on paper and certainly interesting for research. But
>> IMHO only a fully integrated GC offers top performance.
>
> Yes. I understand that GC is very tightly coupled with the language and
> its runtime.
>
> Based on your responses, I'd also like to ask the following questions:
>
> - How modular is LuaJIT when it comes to retargeting it for a different
> language or runtime? How configurable is it? For example, are
> Lua-specific optimizations and other implementation details provided in
> a few well-defined places or are they spread all over the place?
>
> - Are most of the things which heavily depend on the language semantics
> implemented in mostly orthogonal ways or are they very deeply
> inter-dependent? How easy/difficult is it to replace the implementation
> of one such feature with an alternative implementation? E.g. if one
> would like to add a new core data type (e.g. C-like arrays or complex
> numbers, or a different implementation of tables/dicts, which is not
> compliant with Lua's tables)?
>
> - I'm wondering if you ever considered making LuaJIT more generic and
> more modular (in this sense, a bit like LLVM), so that it can serve as
> a basis for tracing JITs for different languages and runtimes? I
> understand that it was not the initial goal, but still... I'm not even
> suggesting that you should do it yourself, if there is no interest from
> your side. But would you be in favor of such a LuaJIT development
> direction? Could you roughly estimate the effort (e.g. amount of
> redesign and/or refactoring) required to make LuaJIT an easier and more
> approachable target for other languages? Would you consider developing
> future LuaJIT changes taking these multi-language support
> considerations into account?
>
> Thanks,
>  -Leo

--
Sent from my Android phone with K-9 Mail. Please excuse my brevity.