Re: Allocation sinking in git HEAD

  • From: Coda Highland <chighland@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Tue, 3 Jul 2012 07:50:19 -0500

On Tue, Jul 3, 2012 at 7:21 AM, Mike Pall <mike-1207@xxxxxxxxxx> wrote:
> LuaJIT git HEAD now contains the new allocation sinking and store
> sinking optimization.
>
> This optimization is enabled by default. In case you encounter any
> problems and want to check whether they are caused by this
> optimization, you can turn it off with: -O-sink
>
> The optimization is geared towards the elimination of short-lived
> aggregates. It handles plain Lua tables as well as FFI cdata (e.g.
> structs, complex or short arrays). It also handles elimination of
> immutable FFI types that are implicitly boxed (e.g. 64 bit ints or
> pointers) in more contexts (e.g. loop-carried variables).
>
> This optimization adds quite a bit of complexity, so I'd appreciate
> it if it would receive wider testing. Feedback welcome!
>
> Here are a few examples that show the improved performance. The
> timings in seconds are for Lua 5.1.5 vs. LuaJIT git HEAD on x86
> (32 bit). Lower numbers are better:
>
> Typical point class with Lua tables:
>
>   local point
>   point = {
>     new = function(self, x, y)
>       return setmetatable({x=x, y=y}, self)
>     end,
>     __add = function(a, b)
>       return point:new(a.x + b.x, a.y + b.y)
>     end,
>   }
>   point.__index = point
>   local a, b = point:new(1.5, 2.5), point:new(3.25, 4.75)
>   for i=1,1e8 do a = (a + b) + b end
>   print(a.x, a.y)
>
> 140.0  Lua
>  26.9  LuaJIT -O-sink
>   0.35 LuaJIT -O+sink *** 400x faster than Lua ***
>
> Typical point class with cdata struct:
>
>   local ffi = require("ffi")
>   local point
>   point = ffi.metatype("struct { double x, y; }", {
>     __add = function(a, b)
>       return point(a.x + b.x, a.y + b.y)
>     end
>   })
>   local a, b = point(1.5, 2.5), point(3.25, 4.75)
>   for i=1,1e8 do a = (a + b) + b end
>   print(a.x, a.y)
>
>  10.9  LuaJIT -O-sink
>   0.20 LuaJIT -O+sink *** 700x faster than Lua ***
>
> 64 bit arithmetic in a loop:
>
>   local x = 0LL
>   for i=1,1e9 do x = x + 100 end
>   print(x)
>
>  45.8  LuaJIT -O-sink (x86)
>  40.9  LuaJIT -O-sink (x64)
>   0.84 LuaJIT -O+sink (x86)
>   0.48 LuaJIT -O+sink (x64)
>
> --Mike
>

Very impressive. :) It's performance improvements like this that make
me wish I could use LuaJIT at work!
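
For anyone who wants to compare the two settings from inside one script
rather than via the command line, here is a rough, untested sketch. It
assumes jit.opt.start() accepts the same flag strings as -O, and the
bench() helper and the reduced iteration count are made up for
illustration; under plain Lua the jit calls are simply skipped.

```lua
-- Sketch: time the quoted point-class loop with sinking off, then on.
-- Assumption: jit.opt.start() takes the same flag strings as -O.
local ok, jit = pcall(require, "jit")  -- not available under plain Lua

-- Hypothetical helper: runs the loop n times, returns seconds and result.
local function bench(n)
  local point
  point = {
    new = function(self, x, y)
      return setmetatable({x=x, y=y}, self)
    end,
    __add = function(a, b)
      return point:new(a.x + b.x, a.y + b.y)
    end,
  }
  point.__index = point
  local a, b = point:new(1.5, 2.5), point:new(3.25, 4.75)
  local t = os.clock()
  for i=1,n do a = (a + b) + b end
  return os.clock() - t, a.x, a.y
end

for _, flag in ipairs({"-sink", "+sink"}) do
  if ok then
    jit.flush()          -- drop traces compiled under the previous setting
    jit.opt.start(flag)  -- affects traces compiled from here on
  end
  local secs, x, y = bench(1e7)
  print(flag, secs, x, y)
end
```

Timings from a single process like this are only a rough guide; separate
runs with -O-sink and -O+sink, as in the quoted numbers, are the cleaner
comparison.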

/s/ Adam
