Re: LuaJIT-friendly API and data structure design

  • From: Mike Pall <mike-1502@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Thu, 19 Feb 2015 17:25:00 +0100

Luke Gorrie wrote:
> Howdy!

First, please don't cross-post between mailing lists. This never
works out well. Please decide for one or the other.

> I start to wonder if this is bad programming style: a Lua function that is
> returning a new pointer.

Then return an index, which is to be used relative to an array.

Apart from the performance issue with unsunk allocations, this
solves all kinds of other issues:

1. No dangling pointers, much easier to debug.

2. Bounds checks are simple (enable for debugging only).

3. You can shrink or expand the base array as needed, without
fixing up all pointers.

4. The indexes themselves can be stored more compactly than a full
pointer. This helps with cache-sensitive workloads (e.g. the
internal LuaJIT IR uses 16 bit references throughout).

> One more downside is that this would not be universal. There are other
> situations where we need to use pointers less predictably and where this
> technique would not apply.

Not sure about the best solution for this. The overhead of
boxing/unboxing seems to be much harder to eliminate than the
literature makes you believe. Actually, it's the allocations that
really hurt, but the boxing is the root cause.

No matter how sophisticated your compiler is, how fast your GC is
or how many recycling or caching tricks you add -- boxing will
hurt you, eventually.

Case in point: the Lua VM always had unboxed doubles, which gave
it a competitive advantage over other interpreted scripting
languages for e.g. scientific workloads (that is ... 10 years ago,
when it was interesting to compare the performance of interpreters).

Ok, so I'm nowadays of the opinion that a VM should avoid boxing
at all cost. Not just for scalars (numbers, pointers), but also
for some aggregates (e.g. complex numbers or small structs). That
would allow a VM to compete with languages that permit stack
allocations and unboxed struct passing.

Alas, this is non-trivial to implement for a dynamic language.
I've previously mentioned the only solution I could come up with
(variable-sized stack slots with early type specialization), but
that's a larger undertaking.

--Mike

Other related posts: