On Tue, Jul 3, 2012 at 7:21 AM, Mike Pall <mike-1207@xxxxxxxxxx> wrote:
> LuaJIT git HEAD now contains the new allocation sinking and store
> sinking optimization.
>
> This optimization is enabled by default. In case you encounter any
> problems and want to check whether they are caused by this
> optimization, you can turn it off with: -O-sink
>
> The optimization is geared towards the elimination of short-lived
> aggregates. It handles plain Lua tables as well as FFI cdata (e.g.
> structs, complex or short arrays). It also handles elimination of
> immutable FFI types that are implicitly boxed (e.g. 64 bit ints or
> pointers) in more contexts (e.g. loop-carried variables).
>
> This optimization adds quite a bit of complexity, so I'd appreciate
> it if it would receive wider testing. Feedback welcome!
>
> Here are a few examples that show the improved performance. The
> timings in seconds are for Lua 5.1.5 vs. LuaJIT git HEAD on x86
> (32 bit). Lower numbers are better:
>
> Typical point class with Lua tables:
>
>   local point
>   point = {
>     new = function(self, x, y)
>       return setmetatable({x=x, y=y}, self)
>     end,
>     __add = function(a, b)
>       return point:new(a.x + b.x, a.y + b.y)
>     end,
>   }
>   point.__index = point
>   local a, b = point:new(1.5, 2.5), point:new(3.25, 4.75)
>   for i=1,1e8 do a = (a + b) + b end
>   print(a.x, a.y)
>
>   140.0   Lua
>    26.9   LuaJIT -O-sink
>     0.35  LuaJIT -O+sink   *** 400x faster than Lua ***
>
> Typical point class with cdata struct:
>
>   local ffi = require("ffi")
>   local point
>   point = ffi.metatype("struct { double x, y; }", {
>     __add = function(a, b)
>       return point(a.x + b.x, a.y + b.y)
>     end
>   })
>   local a, b = point(1.5, 2.5), point(3.25, 4.75)
>   for i=1,1e8 do a = (a + b) + b end
>   print(a.x, a.y)
>
>    10.9   LuaJIT -O-sink
>     0.20  LuaJIT -O+sink   *** 700x faster than Lua ***
>
> 64 bit arithmetic in a loop:
>
>   local x = 0LL
>   for i=1,1e9 do x = x + 100 end
>   print(x)
>
>    45.8   LuaJIT -O-sink (x86)
>    40.9   LuaJIT -O-sink (x64)
>     0.84  LuaJIT -O+sink (x86)
>     0.48  LuaJIT -O+sink (x64)
>
> --Mike
>

Very impressive. :) It's performance improvements like this that make
me wish I could use LuaJIT at work!

/s/ Adam