On Fri, Dec 12, 2014 at 12:46 PM, Mike Pall <mike-1412@xxxxxxxxxx> wrote:
> Alexander Gall wrote:
>> The main question is whether the code can be written in a manner that
>> avoids this effect.
>
> local sum = ffi.new("int64_t[1]")
> ...
> sum[0] = sum[0] + ...
>
> You get an extra store on every inner loop iteration, but no more
> allocations on every outer loop iteration. Whether that's faster
> or not depends on the iteration count of the inner loop vs. the
> outer loop.

Thanks for the detailed explanation. I'm actually using this method in
other parts of my code but failed to realise that it can be applied
here as well.

-- Alex