Re: small script to reproduce bogus trace stitch errors at line 0 with coroutines in latest 2.1
- From: "Claire Lewis" <claire_lewis@xxxxxxxxxxx>
- To: <luajit@xxxxxxxxxxxxx>
- Date: Thu, 30 Apr 2015 08:35:43 +0930
Hi Mike,
Firstly, good news on finding that bug (and also thanks to Elias!)
Just thought I'd add a few random thoughts, apologies if this is completely
off-base or useless.
> When a trace has to stop at a NYI function, it compiles an exit to
> the interpreter with a continuation underneath of it in the stack.
> When the function later returns, this continuation either triggers
> recording of a new trace that continues the control flow or it
> directly jumps to that trace, if already compiled.
* What if it were limited to one such active continuation at a time? The
state (interpreter, not coroutine) could remember where that continuation is,
so that any future trace flush can invalidate it then and there.
This is a little kludgy, but it may be an easy fix for the current
(already-kludgy?) stitching, pending a rethink. It may (?) reduce overall
performance in some cases, but wouldn't those cases be handled by a
subsequent pass? (Assuming said code doesn't get blacklisted for too many
compilation attempts, in which case the code could probably use rethinking
anyway.)
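
To make the "one active continuation" idea concrete, here is a minimal sketch in C. Every name in it (StitchCont, JitState, stitch_begin and so on) is hypothetical and not taken from the LuaJIT source, and it ignores details such as the Lua stack being reallocated while a continuation is pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a stitch continuation frame on a Lua stack. */
typedef struct StitchCont {
  uint16_t traceno;  /* Trace to continue at; 0 means none or invalidated. */
} StitchCont;

/* Hypothetical per-state (not per-coroutine) bookkeeping. */
typedef struct JitState {
  StitchCont *active_cont;  /* NULL when no stitch is pending. */
} JitState;

/* A trace stops at an NYI function. Returns false if a continuation is
** already pending, in which case the caller would exit without stitching. */
static bool stitch_begin(JitState *J, StitchCont *cont, uint16_t traceno)
{
  assert(cont != NULL);
  if (J->active_cont != NULL) return false;
  cont->traceno = traceno;
  J->active_cont = cont;
  return true;
}

/* The NYI function returned into the continuation. Returns the trace
** number to jump to, or 0 to carry on in the interpreter. */
static uint16_t stitch_end(JitState *J, StitchCont *cont)
{
  assert(cont != NULL);
  if (J->active_cont == cont) J->active_cont = NULL;
  return cont->traceno;
}

/* Trace flush: invalidate the single pending continuation, if any. */
static void stitch_flush(JitState *J)
{
  if (J->active_cont != NULL) {
    J->active_cont->traceno = 0;
    J->active_cont = NULL;
  }
}
```

The flush only has to touch one remembered pointer, which is the whole appeal; the cost is that a second stitch attempt while one is pending is simply refused.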
> Finding and cleaning up all of these continuations with stale data
> would require a full GC scan, which is unfeasible. Invalidating
> the data is tricky. Maybe some kind of generation number, which is
> incremented after each flush (but it might wrap). Or some
> completely different idea ...
* Looking briefly at some structures, it appears the trace number is stored
in 16 bits, so perhaps it and the generation number would need to be
stored as 2x16 bits in the op fields? Though I'm not sure.
If the limit is 16+16 bits, that wrap may be a real concern. Perhaps
rather than a generation number, one could use a 32-bit trace number with a
"last flush trace head" value remembered at each flush (i.e. the oldest ID that
is still valid for continuation, found by recording what the next trace ID
would be at flush time and not resetting it). This extends the wrap point to
4bn traces rather than 64k generations. There may be some penalty in having
to subtract off the head value when indexing into things, though I'm not
sure that traces are ever stored in an array.
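
A rough sketch of the "flush head" check in C, assuming 32-bit trace IDs that are never reset. The TraceIds struct and the traceid_* functions are made-up names for illustration, not existing LuaJIT code, and wrap-around of the 32-bit counter is deliberately left unhandled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical trace ID allocator; IDs keep counting across flushes. */
typedef struct TraceIds {
  uint32_t next_id;     /* Next trace ID to hand out; never reset. */
  uint32_t flush_head;  /* Oldest ID that is still valid. */
} TraceIds;

/* Hand out a fresh trace ID. */
static uint32_t traceid_new(TraceIds *t)
{
  assert(t->next_id != UINT32_MAX);  /* Sketch only: no wrap handling. */
  return t->next_id++;
}

/* Trace flush: every ID handed out so far becomes stale. */
static void traceid_flush(TraceIds *t)
{
  t->flush_head = t->next_id;
}

/* A continuation holding 'id' may only be followed if this returns true. */
static bool traceid_valid(const TraceIds *t, uint32_t id)
{
  return id >= t->flush_head && id < t->next_id;
}

/* Index into a per-flush trace array: subtract off the head value. */
static uint32_t traceid_slot(const TraceIds *t, uint32_t id)
{
  assert(traceid_valid(t, id));
  return id - t->flush_head;
}
```

The subtraction in traceid_slot is the indexing penalty mentioned above; the validity test itself is two comparisons.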
* Is a GC scan unfeasible in that it can't be done, or in that it would be a
prohibitive performance penalty?
If it is the latter, what about using a generation number and doing a full GC
scan only if it ever wraps? That would of course cause a pause at that
point, but no more so than repeated trace failures could.
* Lastly, does it even need to store a raw trace ID for continuation at all?
Could it not store a monotonically incrementing continuation ID that is
looked up in a list giving the corresponding trace ID for the
continuation, and any other details needed? This adds a level of indirection,
but I doubt evaluating a continuation is a common or particularly
performance-sensitive thing - once it compiles, it all goes away, I assume.
This would then allow the trace flush to invalidate the continuation list:
attempting to close a continuation will simply fail to find it on the
list (or find it marked deleted, whatever seems best), clean up and carry
on.
Again there is a wrap chance here, but assuming the continuation ID can be 32
bits, having 4bn continuations is getting pretty pathological. If it hits
that, maybe just turn off trace stitching for this state and go home :)
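
The continuation-ID indirection could look roughly like the following C sketch. The table layout, the CONT_MAX size and the cont_* function names are all assumptions made up for illustration; nothing here is existing LuaJIT code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CONT_MAX 64  /* Arbitrary table size for the sketch. */

/* Hypothetical table entry mapping a continuation ID to a trace. */
typedef struct ContEntry {
  uint32_t id;       /* Continuation ID; 0 marks a free slot. */
  uint16_t traceno;  /* Trace to continue at. */
} ContEntry;

typedef struct ContTable {
  ContEntry slot[CONT_MAX];
  uint32_t next_id;    /* Monotonic, starts at 1, never reused. */
  bool stitching_off;  /* Set once the ID space is exhausted. */
} ContTable;

/* Register a continuation. Returns its ID, or 0 if stitching is off
** or the table is full (the caller would then not stitch). */
static uint32_t cont_add(ContTable *t, uint16_t traceno)
{
  assert(traceno != 0);  /* 0 is reserved for "no trace" in this sketch. */
  if (t->stitching_off) return 0;
  if (t->next_id == UINT32_MAX) { t->stitching_off = true; return 0; }
  for (int i = 0; i < CONT_MAX; i++) {
    if (t->slot[i].id == 0) {
      t->slot[i].id = t->next_id++;
      t->slot[i].traceno = traceno;
      return t->slot[i].id;
    }
  }
  return 0;
}

/* Close a continuation. Returns the trace number, or 0 if the ID is
** unknown, e.g. because a flush removed it. */
static uint16_t cont_close(ContTable *t, uint32_t id)
{
  if (id == 0) return 0;
  for (int i = 0; i < CONT_MAX; i++) {
    if (t->slot[i].id == id) {
      t->slot[i].id = 0;
      return t->slot[i].traceno;
    }
  }
  return 0;
}

/* Trace flush: drop every pending continuation in one go. */
static void cont_flush(ContTable *t)
{
  for (int i = 0; i < CONT_MAX; i++) t->slot[i].id = 0;
}
```

A stale continuation then just gets 0 back from cont_close and falls through to the interpreter, and running out of IDs turns stitching off for that state.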
Anyway, just some random thoughts, hope some of it might prove useful.
Thanks,
- Claire