Adam Strzelecki wrote:
> Just wanted to confirm whether this technique will remove heap
> allocations for this kind of block
>
> for i = 0, 1000 -- some tight loop
>   local value = model[i] * view
>   gl.UniformMatrix4fv(location, 1, gl.TRUE, value.gl)
> end
>
> Where model & view are 4x4 matrices of custom FFI metatype
> having __mul operator.

Well, no, because 'value' *does* escape. There's no way for the
compiler to know that gl.UniformMatrix4fv() doesn't store the
'value' pointer somewhere else.

> Now this code above produces heap allocation (and deallocations)
> upon each iteration, regardless that value does not escape from
> the loop.

Only you, as the programmer, have this knowledge from reading the
OpenGL docs. And, no, there's currently no way to tell the
compiler about it.

In C, you'll have to manually select stack allocation for the
value variable (but that's not a feasible approach for Lua).

> So will allocation/store sinking remove this allocation and
> store the temporary result in some fast preallocated memory like
> stack?

Not this one. But other cases work:

  local x,y,z = 0,0,0
  for i=1,1000 do
    local t = { x = i, y = 2*i, z = 3*i } -- Sunk allocation.
    x = x+t.x; y = y+t.y; z = z+t.z
    if i == 500 then
      print(t) -- Allocated only for uncommon side exit.
    end
  end

Usually parts of that would be in different functions, but this
doesn't matter for a trace compiler.

Most vector-style APIs generate and consume lots of temporaries.
This approach should eliminate most of them. You can use tables
or FFI structs or (small) FFI arrays.

> Does this technique has some limits (drawbacks)? Will stop
> working for long call chains (recursive calls)?

It works for loops and tail-recursion. But not for regular
recursion, since the objects usually escape to the Lua stack.

--Mike
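
A minimal sketch of the FFI-struct variant of the table loop above,
assuming LuaJIT with the ffi library. The vec3 type and the
sum_vectors function are hypothetical names chosen for illustration.
The temporaries are only read inside the loop and are never stored
or passed to a C function, which is the situation described as
sinkable. Whether they are actually sunk depends on the LuaJIT
version and on the loop being compiled as a trace.

```lua
-- Requires LuaJIT: the ffi library is not part of standard Lua.
local ffi = require("ffi")

-- Hypothetical 3-component vector type, for illustration only.
ffi.cdef[[
typedef struct { double x, y, z; } vec3;
]]

local vec3
vec3 = ffi.metatype("vec3", {
  -- Every addition creates a fresh temporary struct.
  __add = function(a, b)
    return vec3(a.x + b.x, a.y + b.y, a.z + b.z)
  end,
})

local function sum_vectors(n)
  local x, y, z = 0, 0, 0
  for i = 1, n do
    -- 't' and both operands are temporaries that never leave the loop:
    -- they are not stored anywhere and not handed to a C function.
    local t = vec3(i, 2*i, 3*i) + vec3(1, 1, 1)
    x = x + t.x; y = y + t.y; z = z + t.z
  end
  return x, y, z
end

print(sum_vectors(1000))
```

Passing t (or a pointer to it) to a C function inside the loop would
put this back into the escaping case from the first example.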