Hi Nick,
Thank you so much for the explanation! I replied inline, please see below:
AFAIK LuaJIT does upvalues EXACTLY as PUC Lua.I didn't know that. Thanks for the info! Combining with the other parts of
If GC completed atomic phaseWhat is the "atomic phase"? As far as I know the GC is *not* concurrent,
Note that the number of instructions executed in between FNEW and UCLO isunbounded, you could run a loop for many iterations before leaving the
I don’t think of this as an optimisation; it is rather a question ofsimplicity (GC is incremental and it takes a while before dead objects are
My understanding is that there is a linked list of open upvalues, sortedby slot ordinal, and to create a upvalue one simply traverse the linked
list and insert it at the corresponding place. So I have the following
questions:
AFAIK LuaJIT does upvalues EXACTLY as PUC Lua. See the linked article for
a detailed description of upvalues in PUC Lua
https://www.cs.tufts.edu/~nr/cs257/archive/roberto-ierusalimschy/closures-draft.pdf
1. If I understand it correctly, the linked list is actually weakreference (doesn't guarantee the upvalues will stay alive), and in
'func_finduv', there is logic that checks if the upvalue is dead, and
revive it as necessary. It seems to me this logic can only trigger if one
does f = FNEW(...), and then GC runs and 'f' dies before 'UCLO' is called.
Only in this case we can get a dead upvalue in the linked list. Is that
correct? It seems to me the case where this optimization targets is quite
narrow (the GC must run and collect 'f' between FNEW and UCLO). Did I miss
anything? Is there some optimizer no-escape analysis to make this
optimization more useful in the tracing JIT?
Sounds correct. Create f1, let it perish, stay in the same scope, create
f2 sharing upvalue with f1. If GC completed atomic phase before creating
f2, and f1 was unreachable, f2’s FNEW will encounter dead open upvalue.
Note that the number of instructions executed in between FNEW and UCLO is
unbounded, you could run a loop for many iterations before leaving the
scope and calling UCLO. I don’t think of this as an optimisation; it is
rather a question of simplicity (GC is incremental and it takes a while
before dead objects are purged; if you encountered a dead object, getting
rid of the dead object is extra logic while resurrection is simpler).
The JIT compiler couldn’t trace through FNEW and UCLO the last time I
checked.
2. To create every upvalue, one needs to traverse the linked list, sothis seems to be a O(n^2) algorithm for each FNEW. Is this acceptable
because FNEW with upvalue is the slow path under LuaJIT's assumed use case?
This looks pretty much O(n^2). We should define n though. First, when
creating a closure, we only need to func_finduv for captured variables in
the current function. “Inherited” upvalues are essentially for free. When
searching, we don’t have to traverse the whole list as we terminate as soon
as we encounter an upvalue that points to a stack slot *above* the one we
are interested in.
Supposedly, typical Lua code don’t capture many variables in closures,
hence this was never considered a hot-spot.