Re: Make the VM Lua-version-agnostic and modularize

  • From: "Soni L." <fakedme+lj@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Thu, 24 Sep 2015 13:39:00 -0300



On 24/09/15 11:12 AM, Vyacheslav Egorov (Redacted sender vegorov for DMARC) wrote:

I'd say making a VM that supports multiple incompatible language versions at the
same time is a really wrong way to do this.

You have conditions checking "runtime version" scattered across the code,
complicating any and all attempts to reason about semantics.

Additionally, bytecode space (and interpreter code size) is a rather finite
resource, and wasting it on supporting version-specific bytecodes does not feel
right either.

For 5.3 one of the challenges is figuring out the way to deal with integer
values.

You can't fit the whole int64 range into the TValue - so if you want to keep
TValue representation you'll have to box it (similar to what FFI does) and hope
that JIT manages to sink the boxing. However performance of the interpreter will
drop.

You can start boxing "upper" part of the int64 range keeping "middle" part that
fits into TValue unboxed - however this introduces complexity and still leads to
unexpected performance characteristics because some operations just tend to
produce out of range int64 values (e.g. bitwise manipulation is a common
culprit).
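
The range split described above can be sketched in C. This is only an illustration: the 47-bit payload width and the names UNBOXED_BITS, fits_unboxed and shl are assumptions made for the example, not LuaJIT's actual TValue layout or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: payload bits left for an unboxed integer once tag bits
 * are reserved inside a 64-bit tagged value. 47 is an illustrative
 * assumption, not LuaJIT's real layout. */
#define UNBOXED_BITS 47

/* True if v lies in the signed UNBOXED_BITS-bit "middle" range and could
 * stay unboxed; anything outside would have to be boxed. */
static bool fits_unboxed(int64_t v)
{
    const int64_t lim = (int64_t)1 << (UNBOXED_BITS - 1);
    return v >= -lim && v < lim;
}

/* Left shift done on the unsigned representation. Bitwise operations
 * like this easily push a small value out of the unboxed range. */
static int64_t shl(int64_t v, unsigned n)
{
    assert(n < 64);
    return (int64_t)((uint64_t)v << n);
}
```

With these assumptions, fits_unboxed(1) holds but fits_unboxed(shl(1, 50)) does not, so a single shift is enough to move a value from the unboxed path to the boxed one.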

To make things worse: for performance reasons LuaJIT already has a DUALNUM mode -
in which it tries to keep floating point numbers from int32 range represented as
int32 values - now if you want a VM that supports 5.1 and 5.3 and performs well
for both on ARM you'll find yourself in the spot where you suddenly have floating
point values represented as either floating point value or an integer and you
also have integers represented as either boxed or unboxed integer.

This is the level of complexity we are talking about here and it is not
desirable.

// Vyacheslav Egorov

TValues can fit doubles. Surely they CAN fit int64s as well.

At most you'll need bitop opcodes, a new LEN opcode that respects __len (emitted by the Lua 5.2 and Lua 5.3 parser/loader), maybe opcodes for the Lua 5.2 gt/lt/le/ge (which call metamethods even for different types), and new stuff in the stdlib: require"5.2" would return a table similar to the default _G, but with some things added (such as table.(un)pack), others removed (such as getfenv/setfenv), and others replaced (such as pairs/ipairs, which would respect __pairs/__ipairs). Even assuming the opcodes don't have a free argument, that's only a handful of opcodes!

The registry would have 3 "global environments" instead of 1: one for Lua 5.1, another for Lua 5.2, another for Lua 5.3.
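
A rough sketch of that idea in C, with one environment slot per language version. All names here (LuaVersion, Env, Registry, registry_env) are hypothetical and only stand in for whatever the real registry would hold:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical version tags; LUA_VCOUNT is the number of versions. */
typedef enum { LUA_V51, LUA_V52, LUA_V53, LUA_VCOUNT } LuaVersion;

/* Stand-in for a globals table. */
typedef struct Env { const char *name; } Env;

/* One global environment per supported language version. */
typedef struct Registry {
    Env envs[LUA_VCOUNT];
} Registry;

/* Look up the global environment for a chunk of the given version. */
static Env *registry_env(Registry *r, LuaVersion v)
{
    assert(r != NULL && (int)v >= 0 && (int)v < LUA_VCOUNT);
    return &r->envs[v];
}
```

A chunk loaded as Lua 5.2 would then get registry_env(&r, LUA_V52) as its globals, while shared functions such as print could be stored in all three tables.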

Most functions can be unified, e.g. you don't need 3 separate print()s, just use the same one!
Even the parser can be unified! Just pass some function pointers around, e.g. a pointer for the operator parsers (Lua 5.3), a pointer for the LEN opcode emitter (Lua 5.2 and 5.3), etc. Alternatively use flags, although they tend to be a bit less maintainable...
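
Roughly what passing function pointers around could look like, as a minimal C sketch. The opcode names, the ParserHooks struct and every function below are made up for illustration and are not taken from the LuaJIT sources:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes: raw length vs. length that respects __len. */
typedef enum { OP_LEN_RAW, OP_LEN_META } Opcode;

/* Version-specific behaviour, selected once per chunk being parsed. */
typedef struct ParserHooks {
    Opcode (*emit_len)(void);  /* opcode emitted for the '#' operator */
    int (*is_binop)(char c);   /* which binary operator characters exist */
} ParserHooks;

static Opcode emit_len_51(void) { return OP_LEN_RAW; }
static Opcode emit_len_52(void) { return OP_LEN_META; }

/* Arithmetic operators shared by all versions (subset, for brevity). */
static int is_binop_base(char c)
{
    return c == '+' || c == '-' || c == '*' || c == '/';
}

/* Lua 5.3 adds bitwise operators on top of the shared set. */
static int is_binop_53(char c)
{
    return is_binop_base(c) || c == '&' || c == '|' || c == '~';
}

static const ParserHooks hooks_51 = { emit_len_51, is_binop_base };
static const ParserHooks hooks_52 = { emit_len_52, is_binop_base };
static const ParserHooks hooks_53 = { emit_len_52, is_binop_53 };

/* Shared parser code: the version-specific part sits behind the hook. */
static Opcode parse_len_operator(const ParserHooks *h)
{
    assert(h != NULL && h->emit_len != NULL);
    return h->emit_len();
}
```

Here parse_len_operator(&hooks_51) yields OP_LEN_RAW and parse_len_operator(&hooks_53) yields OP_LEN_META, without any version check inside the shared parser function itself.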

From what I'm seeing there's not much added complexity: everything is (mostly) already in.
