alleviate the load of the GC

  • From: Laurent Deniau <Laurent.Deniau@xxxxxxx>
  • To: "luajit@xxxxxxxxxxxxx" <luajit@xxxxxxxxxxxxx>
  • Date: Wed, 2 Sep 2015 09:49:04 +0000

Hi All,

<note>
I know that language extensions for Lua should be discussed on the Lua mailing
list, but since the project (if the transition from Mike to the community
succeeds) will be more open than ever, it might also be interesting to make
proposals for the evolution of Lua(JIT) through tested implementations in
LuaJIT. This is common in the C++ world, where the ISO committee prefers to
adopt proposals for which a feasibility demonstration exists. It means that
someone has already tackled the details...
</note>

Question:
How difficult would it be to patch LuaJIT in order to implement an __assign
metamethod? I have very little knowledge of the internals of LuaJIT, and I
tried without success to work out how __add and the other operators are
handled.

Definition:
__assign would overload the '=' operator if it is defined for either the lhs
or the rhs (same as for '+', '-', '*', ...) and would return the value to
effectively assign.
The default behaviour would be the identity:
    function __assign(lhs, rhs)
      return rhs
    end
and would be used as:
    lhs = __assign(lhs, rhs)
Of course, the identity should be optimised out...
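
To make the lookup rule above concrete, here is a minimal sketch that emulates
it in plain Lua. It is only an illustration: neither Lua nor LuaJIT consults
an '__assign' field today, and the helper 'assign' and the type 'Vec' are
hypothetical names invented for this example.

```lua
-- Hypothetical helper: stands in for the statement 'lhs = rhs'.
-- It looks for '__assign' on the lhs first, then on the rhs.
local function assign(lhs, rhs)
  local mt = getmetatable(lhs)
  local h = type(mt) == "table" and mt.__assign
  if not h then
    mt = getmetatable(rhs)
    h = type(mt) == "table" and mt.__assign
  end
  if h then return h(lhs, rhs) end
  return rhs -- default behaviour: identity
end

-- Hypothetical type whose __assign reuses the storage of the lhs.
local Vec = {}
Vec.__assign = function(lhs, rhs)
  if getmetatable(lhs) ~= Vec then return rhs end -- nothing to reuse
  for i = 1, #rhs do lhs[i] = rhs[i] end          -- copy into lhs
  for i = #rhs + 1, #lhs do lhs[i] = nil end      -- drop leftover slots
  return lhs
end

local a = setmetatable({1, 2, 3}, Vec)
local b = setmetatable({4, 5}, Vec)
a = assign(a, b)          -- a keeps its own table, now holding 4, 5
local n = assign(nil, 42) -- no metamethod on either side: identity
```

A real implementation would have to live in the compiler and VM so that the
identity case costs nothing; the function call here is just the closest
approximation available without patching LuaJIT.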

Motivation:
Overloading other operators (beyond syntactic sugar) enables optimisations
such as sharing temporaries, building lazy expressions, or flattening
expressions, all of which greatly reduce the load on the GC. C++ libraries
have been using this kind of optimisation successfully for decades. To be a
bit provocative, overloading operators makes little sense, beyond basic
syntactic sugar, without being able to overload '='. The operator '=' should
be seen as a component of the expression (or statement), not as a special
case, and there is no reason not to be able to overload it.

Experience:
I have observed with LuaJIT (and other GC-ed languages, including C/C++ with
BoehmGC) that I can get a speed-up of 10x-100x when I can properly manage and
share temporaries within expressions, but this requires some _local_
"finalisation" when the result is assigned somewhere. The 'where' is not
important (local, global, key, whatever); what matters is knowing whether the
user can reuse the result (the resulting temporary has been assigned) or not
(no access to the temporary), that is, whether the result is semantically
anchored. Without the reuse of temporaries, the GC gets quadratically slower
(especially with large objects that LuaJIT does not sink).
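
The idea of sharing temporaries and finalising them at assignment can be
sketched as follows. This is a toy under stated assumptions, not the code
behind the figures above: 'Vec', 'vec', 'assign', the 'tmp' flag and the
'allocs' counter are all hypothetical, and 'assign' again stands in for an
'=' that would call '__assign'.

```lua
local Vec = {}
local allocs = 0 -- counts freshly allocated vectors, for illustration only

local function new(n, tmp)
  allocs = allocs + 1
  return setmetatable({ n = n, tmp = tmp }, Vec)
end

-- Build an anchored (user-owned) vector from its elements.
local function vec(...)
  local v = new(select("#", ...), false)
  for i = 1, v.n do v[i] = select(i, ...) end
  return v
end

Vec.__add = function(x, y)
  -- Reuse an operand that is an unanchored temporary, else allocate one.
  local r = (x.tmp and x) or (y.tmp and y) or new(x.n, true)
  for i = 1, x.n do r[i] = x[i] + y[i] end
  return r
end

-- Local "finalisation": the result becomes anchored and must never be
-- overwritten by a later expression.
Vec.__assign = function(lhs, rhs)
  rhs.tmp = false
  return rhs
end

-- Hypothetical stand-in for 'lhs = rhs' (only the rhs is inspected here).
local function assign(lhs, rhs)
  local mt = getmetatable(rhs)
  local h = type(mt) == "table" and mt.__assign
  if h then return h(lhs, rhs) end
  return rhs
end

local a, b, c = vec(1, 2, 3), vec(10, 20, 30), vec(100, 200, 300)
local before = allocs
local r = assign(nil, a + b + c) -- one temporary, reused by the second '+'
local used = allocs - before     -- 1 instead of 2
local s = assign(nil, a + r)     -- r is anchored, so a new vector is made
```

Without the hook at '=', a plain 'local x = a + b' would leave x flagged as a
temporary, and a later 'x + c' would silently overwrite it. That is the case
where knowing that the result has been anchored matters.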

Best,
Laurent.
