Re: Suggestions for how to optimize around "persistent type instability"?

  • From: Mike Pall <mike-1304@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Thu, 4 Apr 2013 20:13:54 +0200

demetri wrote:
> Thanks Dimiter, that gets us down to 5.5x on our test machine;

First, the C and the LuaJIT files are not doing the same thing
(x128 etc. aren't used at all). Also, LuaJIT doesn't need any
warm-up time, so you can omit the first loop. And casts to scalar
number types are rarely helpful -- use the semantics of bit.* to
constrain numbers to integers.
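
As a small illustration of that last point (a sketch only, not taken from the attached benchmark; the variable names are made up for the example): bit.tobit() normalizes a number to the signed 32-bit range, and the other bit.* functions return results in that range too, so no FFI cast is needed to keep a value an integer.

```lua
local bit = require("bit")
local tobit, band, shr = bit.tobit, bit.band, bit.rshift

-- tobit() wraps any number into the signed 32-bit integer range.
local wrapped  = tobit(0xffffffff + 1)  -- 2^32 wraps around to 0
local negative = tobit(0xffffffff)      -- reinterpreted as int32: -1

-- Split the sum of two 16-bit digits into a low digit and a carry.
local sum   = 0xffff + 0x0001           -- 0x10000
local low   = band(sum, 0xffff)         -- low 16 bits: 0
local carry = shr(sum, 16)              -- carry into the next digit: 1

print(wrapped, negative, low, carry)
```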

Fixed LuaJIT benchmark attached. Runs at about the same speed as
the C code.


BTW: Please use locals *everywhere* and *every time*. Make it a
habit. I mean ... you're writing to 8 globals in that short piece
of code ... and it's supposed to be a benchmark ...
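
One way to catch stray global writes like these is a guard on the global table. This is a minimal sketch of a common idiom, not something from the original benchmark; forbid_global_writes and counter are hypothetical names used only for illustration.

```lua
-- Hypothetical helper: make any write to a not-yet-existing global an error.
local function forbid_global_writes()
  setmetatable(_G, {
    __newindex = function(_, name)
      error("attempt to write to global '" .. tostring(name) .. "'", 2)
    end,
  })
end

forbid_global_writes()

-- A forgotten 'local' now fails loudly instead of silently creating a global.
local ok, err = pcall(function()
  counter = 1  -- accidental global write
end)

print(ok, err)
```

With the guard installed, the assignment raises an error and no global named counter is created.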

I cringe every time I see this:

  ffi = require("ffi") -- YUCK!

The FFI module doesn't set a global on purpose. And then you store
the result of require in a global, where it's happily overwritten
by the next user of the FFI module ... wheeee. Or someone doesn't
explicitly require it, but uses it and you'll never notice.

The only acceptable way to require the FFI module is this:

  local ffi = require("ffi")

Which is incidentally explained right at the top of:
  http://luajit.org/ext_ffi_tutorial.html

--Mike

-- Fixed LuaJIT benchmark (the attachment referred to above).
local ffi = require "ffi"
local bit = require "bit"
local tobit, shr = bit.tobit, bit.rshift

ffi.cdef [[
typedef uint16_t Dt;
]]

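-- Ripple-carry add of two n-element arrays of 16-bit digits: r = x + y.
local function rcadd(r, x, y, n)
  local c = 0
  for i=0,n-1 do
    -- Carry out of the previous digit plus both digits; tobit() keeps c an int32.
    c = tobit(shr(c, 16) + x[i] + y[i])
    r[i] = c  -- The store into a uint16_t element keeps only the low 16 bits.
  end
end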

local x128 = ffi.new("Dt[128]")
local y128 = ffi.new("Dt[128]")
local r128 = ffi.new("Dt[128]")

for i=1,1e7 do rcadd(r128, x128, y128, 128) end
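
A quick sanity check of the carry propagation, as a sketch: it repeats rcadd from the benchmark above and runs it on a two-digit input chosen for this example (the inputs and the assert are not part of the original attachment). It needs LuaJIT, since it uses the ffi and bit modules.

```lua
local ffi = require("ffi")
local bit = require("bit")
local tobit, shr = bit.tobit, bit.rshift

-- Same function as in the benchmark above.
local function rcadd(r, x, y, n)
  local c = 0
  for i = 0, n-1 do
    c = tobit(shr(c, 16) + x[i] + y[i])
    r[i] = c
  end
end

-- 0x0001ffff + 0x00000001, least significant 16-bit digit first.
local x = ffi.new("uint16_t[2]", {0xffff, 0x0001})
local y = ffi.new("uint16_t[2]", {0x0001, 0x0000})
local r = ffi.new("uint16_t[2]")

rcadd(r, x, y, 2)

-- Expected sum is 0x00020000: low digit 0, high digit 2 (carry propagated).
assert(r[0] == 0x0000 and r[1] == 0x0002)
```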
