Re: Docs for allocation sinking optimization

  • From: Mike Pall <mike-1207@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Thu, 12 Jul 2012 01:36:45 +0200

Geoff Leyland wrote:
> Does allocation sinking put pressure on stack slots?  It seems
> like value stored in an allocation has to be live for the whole
> trace.

Yes, sometimes. The values only have to live until the last
possible point of escape -- which is basically the whole trace in
your example.

> do
>   local x, z = 0, nil
>   for i=1,100 do
>     local t = {i, i+1, i+2, i+3, i+4, i+5}
>     if i == 90 then
>       z = t
>     end 
>     x = x + t[1]
>   end
>   print(x, z[1])
> end
> 
> It looks like t[2..5] are occupying stack slots.  I can see that
> it's hard to work out that they're dead.

Well, you *could* print z[5] after the loop. But the part after
the loop is outside of the analysis scope, anyway.

Ok, one one hand the example is synthetic -- you usually *do*
access the values you store in an aggregate. E.g. for
  local t = {expr1, expr2}
you'll most certainly access t[1] _and_ t[2] afterwards and
on-trace. So expr1 and expr2 need to be computed, anyway.

OTOH one could certainly improve the algorithm a bit. It already
sinks the integer-to-double conversions (numbers stored in a table
in single-number mode are always doubles). One could go one step
further and sink easily rematerializable computations. Maybe only
if the input(s) to that computation are already live at the exit.
E.g. sink 'i+5', if 'i' is live and rematerialize it later.

The compiler folklore tells us that rematerialization is not
always a win. Usually only constants (already handled) or really
simple operations with a maximum of one variable input should be
rematerialized. I.e. ADDOV(i, 5) would qualify.

Anyone who wants to learn a bit about the compiler internals is
welcome to add this feature:

Look how IR_CONV is handled and adapt that to e.g. ADDOV etc.:
- lj_asm.c      asm_snap_alloc1()
- lj_opt_sink.c sink_checkphi()
- lj_snap.c     lj_snap_replay()
- lj_snap.c     snap_restoreval()
- lj_snap.c     snap_restoredata()

Constants are already moved to the right, so you can check for:
  if (irref_isk(ir->op2) && (ir->o == IR_ADDOV || ir->o == ...) &&
      ir_used(IR(ir->op1)))

I'm not sure the ir_used() check is a good idea or not. You can
try without it, but then you'll need to limit recursion in
asm_snap_alloc1(). Two levels, like in CONV(ADDOV(x, k)) should
suffice. E.g. check that only one non-CONV instruction is sunk.

Oh, and you have to be a bit careful with IR_HIOP on FP arithmetic
for LJ_SOFTFTP targets (hi/lo inputs need to be allocated/handled).

--Mike

Other related posts: