Re: Help understanding a case of ineffective allocation sinking

  • From: Mike Pall <mike-1412@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Fri, 12 Dec 2014 12:46:37 +0100

Alexander Gall wrote:
> I'm trying to understand why the following code (distilled from a
> real-world use case) contains an unsunk allocation:

The second trace feeds back the int64 to the setup part of the
first trace. Since the setup part unboxes the int64, it needs to
be boxed by the second trace before joining to the first one.

> If I understand correctly, the "sum" variable is basically stored in
> slot #4.

The boxed variable is stored in the slot.

> Then why does trace 2 use PVAL to refer to it?

Because it has been modified and not yet boxed and stored back to
the slot (that's the effect of allocation sinking).

A cdata int64 is immutable, so it has to be unboxed/boxed at every
modification. The JIT-compiler can eliminate that in most cases,
but not in all of them.
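
A minimal sketch of the difference between an immutable boxed int64
and a mutable one-element array. The variable names and values are
made up for illustration; only LuaJIT's ffi library and its 64-bit
integer literals are assumed:

```lua
local ffi = require("ffi")

-- Boxed int64: arithmetic yields a new cdata object, the old one is untouched.
local a = 5LL
local b = a          -- b refers to the same boxed value as a
a = a + 1            -- a now refers to a new boxed value; b is unchanged

-- One-element array: the element is updated in place, no new object is created.
local arr = ffi.new("int64_t[1]")   -- contents are zero-initialized
local alias = arr                   -- second reference to the same array
arr[0] = arr[0] + 1                 -- the update is visible through alias, too
```

The first half behaves the same whether or not the boxing is sunk by
the JIT-compiler; sinking only removes the allocation, not the
value semantics.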

> I don't know if it's related, but I'm
> also puzzled that index 0002 is marked as being referred to by a PHI
> instruction when there is none in that trace. It appears to be
> inherited from index 0048 in the parent trace, but why?

PHI marks on individual instructions don't mean anything for the
instruction itself (as opposed to PHI instructions). But these
marks influence register allocation -- there's a strong preference
to not spill or rename the register, which is generally beneficial.

> The main question is whether the code can be written in a manner that
> avoids this effect.

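local ffi = require("ffi")
local sum = ffi.new("int64_t[1]")  -- mutable one-element array, not a boxed int64
...
  sum[0] = sum[0] + ...            -- in-place update: a store, no new allocation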

You get an extra store on every inner loop iteration, but no more
allocations on every outer loop iteration. Whether that's faster
or not depends on the iteration count of the inner loop vs. the
outer loop.
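
For comparison, here is a hedged sketch of both forms in a nested
loop. The original code is not reproduced in this post, so the loop
bounds (OUTER, INNER), the function names and the summed values are
hypothetical; only the shape of the accumulator matters:

```lua
local ffi = require("ffi")

-- Hypothetical iteration counts, chosen only for illustration.
local OUTER, INNER = 100, 1000

-- Boxed int64 accumulator: the value is unboxed/boxed around modifications,
-- which the JIT-compiler may or may not be able to sink.
local function sum_boxed()
  local sum = 0LL
  for i = 1, OUTER do
    for j = 1, INNER do
      sum = sum + j
    end
  end
  return sum
end

-- One-element array accumulator: one extra store per inner iteration,
-- but no boxing of the accumulator when traces are joined.
local function sum_array()
  local sum = ffi.new("int64_t[1]")
  for i = 1, OUTER do
    for j = 1, INNER do
      sum[0] = sum[0] + j
    end
  end
  return sum[0]
end

-- Both variants compute the same result.
assert(sum_boxed() == sum_array())
```

Which one is faster needs to be measured with the real iteration
counts; the sketch only shows that the two forms are interchangeable
with respect to the result.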

--Mike
