Re: LuaJIT-friendly API and data structure design

  • From: Luke Gorrie <luke@xxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Fri, 20 Feb 2015 08:00:40 +0100

On 19 February 2015 at 17:25, Mike Pall <mike-1502@xxxxxxxxxx> wrote:

> Luke Gorrie wrote:
> > Howdy!
>
> First, please don't cross-post between mailing lists. This never
> works out well. Please decide for one or the other.
>

Next you will be suggesting I shouldn't HTML-format my mails... ;-)


> > I start to wonder if this is bad programming style: a Lua function that
> > is returning a new pointer.
>
> Then return an index, which is to be used relative to an array.
>

That is a definite possibility for the special case of packets.

One of the reasons that we kept our packet API looking like
'module.function(object, ...)' was to allow object to be a number, as
opposed to 'object:method(...)' which would require the object to be
something with a method table.
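
A minimal sketch of the index-based style, assuming a made-up packet layout and a single statically allocated pool. The struct fields, the pool size and 'packet.setlength' are hypothetical names invented for illustration; only the 'module.function(object, ...)' shape comes from the text above.

```lua
local ffi = require("ffi")

-- Hypothetical packet layout, for illustration only.
local packet_t = ffi.typeof("struct { uint16_t length; uint8_t data[64]; }")

local packet = {}
local MAX = 1024  -- assumed pool size

-- One array allocated up front; a "packet" is just an index into it.
local pool = ffi.new(ffi.typeof("$[?]", packet_t), MAX)

function packet.length(i) return pool[i].length end
function packet.setlength(i, n) pool[i].length = n end

-- The handle is a plain Lua number, so no pointer object is created.
local p = 7
packet.setlength(p, 60)
print(packet.length(p))
```

Because 'p' is a number it has no method table, which is why 'p:length()' is not available in this style.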

> > One more downside is that this would not be universal. There are other
> > situations where we need to use pointers less predictably and where this
> > technique would not apply.
>
> Not sure about the best solution for this. The overhead of
> boxing/unboxing seems to be much harder to eliminate than the
> literature makes you believe. Actually, it's the allocations that
> really hurt, but the boxing is the root cause.
>

How about Justin's trick for reusable FFI boxes?

That is, Lua variables would hold not immutable raw FFI pointers ($ *) but
reassignable FFI boxes, either arrays ($[1]) or structs
(struct { $ *value; }).

These boxes would be allocated statically or perhaps via freelists.

The user would need to be aware that the boxes are not immutable values and
that reassigning them updates the existing box object rather than returning
a new one. However, the details of the box itself could be handled inside
the API.

For example with packets we could write:

-- get a reusable box for referring to packets
-- (This is a heap allocation and we don't do this in inner loops)
local p = packet.newbox()
-- make 'p' point to a newly received packet:
link.receive(mylink, p)
-- access the packet, letting 'packet' module worry about the box
print(packet.length(p))
-- alternative syntax, with metatable methods on the box type
print(p:length())
-- now process the rest of the packets with p
while link.receive(mylink, p) do work(p) end

This API seems reasonably intuitive and uncluttered to me. (I would be
bothered if users of the API had to write [0] all over the place.) There is
no dependency on allocation sinking optimization because no pointers are
being stored directly in Lua variables.
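
One way the box could be handled inside the API, as a rough sketch rather than the real implementation. The packet layout, the queue inside 'link', and 'link.new' are assumptions made up here so that the example runs on its own; 'packet.newbox', 'packet.length' and 'link.receive' follow the usage shown above.

```lua
local ffi = require("ffi")

-- Hypothetical packet layout (an assumption for this sketch).
local packet_t = ffi.typeof("struct { uint16_t length; uint8_t data[64]; }")

local packet = {}
function packet.length(box) return box.value.length end

-- Reassignable box: a struct holding a single packet pointer.
local box_t = ffi.typeof("struct { $ *value; }", packet_t)
-- Metatable methods on the box type give the p:length() syntax.
ffi.metatype(box_t, { __index = { length = packet.length } })

-- Heap allocation; done once, outside inner loops.
function packet.newbox() return box_t() end

-- Stand-in link: a small fixed queue of packets (hypothetical).
local link = {}
function link.new()
  return { queue = ffi.new(ffi.typeof("$[?]", packet_t), 4),
           read = 0, write = 0 }
end

-- Point the caller's box at the next packet instead of returning a pointer.
function link.receive(l, box)
  if l.read == l.write then return false end
  box.value = l.queue + l.read  -- pointer store into the existing box
  l.read = l.read + 1
  return true
end

-- Usage, mirroring the example above.
local mylink = link.new()
mylink.queue[0].length = 60
mylink.queue[1].length = 1500
mylink.write = 2

local p = packet.newbox()
local total = 0
while link.receive(mylink, p) do
  total = total + p:length()
end
print(total)
```

The box does not keep the packet memory alive; in this sketch the link table owns the queue, so the pointer stays valid for as long as the link does.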

This also seems like a solution that could generalize fairly well e.g. if
we had a dozen FFI types that we wanted to box.
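
To illustrate how this might generalize, here is a sketch of a helper that derives a box type from any ctype, using the one-element array form ($[1]). The helper name 'boxtype' and both struct types are hypothetical and exist only for this example.

```lua
local ffi = require("ffi")

-- Hypothetical helper: given any ctype, build the matching box type,
-- here a one-element array of pointers to that type.
local function boxtype(ct)
  local ptr_t = ffi.typeof("$ *", ct)
  return ffi.typeof("$[1]", ptr_t)
end

-- Two unrelated, made-up types boxed by the same helper.
local point_t = ffi.typeof("struct { int x, y; }")
local stats_t = ffi.typeof("struct { double bytes; }")
local pointbox_t, statsbox_t = boxtype(point_t), boxtype(stats_t)

local points = ffi.new(ffi.typeof("$[?]", point_t), 2)
points[1].x, points[1].y = 3, 4
local stats = ffi.new(ffi.typeof("$[?]", stats_t), 1)
stats[0].bytes = 1500

-- Re-pointing a box is a store into slot [0] of the existing box.
local pb, sb = pointbox_t(), statsbox_t()
pb[0] = points + 1
sb[0] = stats + 0

-- A real API would hide the [0] inside module functions.
print(pb[0].x + pb[0].y, sb[0].bytes)
```

With the array form the '[0]' shows up wherever the box is dereferenced, so wrapping it in module functions or metatable methods keeps it out of user code.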

(I am conscious that Alex has a "told you so" waiting for me once I finally
work my way through all of this in my own brain... :-))

Questions:

How does the efficiency of this FFI boxing compare with Lua boxing? I
expect it to be much better, because assigning a new value to an FFI box
is morally a MOV instruction and the GC is not involved.

How about the relative efficiency of the two API styles, 'packet.length(p)'
vs 'p:length()'? I have occasionally experimented with such things and so
far have not seen a meaningful difference -- both seem to be very fast. I
don't feel that I understand the full picture though. (I am a bit torn
between liking the explicitness of the first style vs the brevity of the
second.)
