Alexander Gall wrote:
> I'm trying to understand why the following code (distilled from a
> real-world use case) contains an unsunk allocation:

The second trace feeds back the int64 to the setup part of the first
trace. Since the setup part unboxes the int64, it needs to be boxed
by the second trace before joining to the first one.

> If I understand correctly, the "sum" variable is basically stored in
> slot #4.

The boxed variable is stored in the slot.

> Then why does trace 2 use PVAL to refer to it?

Because it has been modified and not yet boxed and stored back to the
slot (that's the effect of allocation sinking).

A cdata int64 is immutable, so it has to be unboxed/boxed at every
modification. The JIT-compiler can eliminate that in most cases, but
not in all of them.

> I don't know if it's related, but I'm
> also puzzled that index 0002 is marked as being referred to by a PHI
> instruction when there is none in that trace. It appears to be
> inherited from index 0048 in the parent trace, but why?

PHI marks on individual instructions don't mean anything for the
instruction itself (as opposed to PHI instructions). But these marks
influence register allocation -- there's a strong preference to not
spill or rename the register, which is generally beneficial.

> The main question is whether the code can be written in a manner that
> avoids this effect.

  local sum = ffi.new("int64_t[1]")
  ...
  sum[0] = sum[0] + ...

You get an extra store on every inner loop iteration, but no more
allocations on every outer loop iteration. Whether that's faster or
not depends on the iteration count of the inner loop vs. the outer
loop.

--Mike
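
For reference, here is a minimal sketch of the two variants being compared,
assuming LuaJIT with the FFI library. The original code is not shown in this
message, so the function names (sum_boxed, sum_array), the nested loop
structure and the loop body are hypothetical stand-ins rather than the
poster's actual code.

```lua
-- Requires LuaJIT; the ffi library is not part of plain Lua.
local ffi = require("ffi")

-- Variant 1 (hypothetical): the accumulator is an immutable boxed int64
-- cdata. Every update conceptually creates a new boxed value, which the
-- JIT compiler can usually, but not always, sink.
local function sum_boxed(outer, inner)
  local sum = 0LL
  for i = 1, outer do
    for j = 1, inner do
      sum = sum + j
    end
  end
  return sum
end

-- Variant 2 (hypothetical): the accumulator lives in a one-element
-- int64_t array, as suggested above. Each update is a store into the
-- array instead of a rebinding of a boxed value.
local function sum_array(outer, inner)
  local sum = ffi.new("int64_t[1]")
  for i = 1, outer do
    for j = 1, inner do
      sum[0] = sum[0] + j
    end
  end
  return sum[0]
end

-- Both variants compute the same value; only the generated traces differ.
assert(sum_boxed(100, 100) == sum_array(100, 100))
```

Running each variant under `luajit -jdump` and comparing the traces is one
way to see whether the allocation is still present. Which variant is faster
depends on the inner vs. outer iteration counts, as noted above.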