FFI array performance

  • From: Simon Cooke <sjcfwd@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Fri, 25 May 2012 16:08:32 -0400

I've been trying out the FFI library recently and have tested the
variable-length array feature, with mixed performance results. For
native types (e.g. float, double) it is very efficient, but for
simple structs performance drops dramatically, by roughly 60x in
the test below.

The following test case demonstrates this for arrays of native
'double' and an equivalent struct 'boxed':

-----------------------------------------------------------------
local ffi = require("ffi")

ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
typedef struct { double x; } boxed;
]]

local function test(T)
    local N = 2^26
    local arr = ffi.new(T.."[?]", N)
    local c = ffi.new(T,10)
    -- Time N assignments of the same value into the array.
    local t = os.clock()
    for i = 0, N-1 do arr[i] = c end
    t = os.clock() - t
    print(T..' :', 'for i = 0,N-1 do arr[i]=c end : ', t..'s',
          t/N*1e9 ..' ns/element')
end

print(jit.version)
test'double'
test'boxed'
-----------------------------------------------------------------

Using VC++10 (x64) on Windows 7 (3.33 GHz Xeon) this gives:

LuaJIT 2.0.0-beta10
double :  for i = 0,N-1 do arr[i]=c end :   0.085s  1.2665987014771 ns/element
boxed :   for i = 0,N-1 do arr[i]=c end :   5.039s  75.086951255798 ns/element

Essentially the same work is being done in each case, but it seems
that JIT compilation is being prevented for the 'boxed' case. Is this
to be expected, or is it something that might be possible to fix?
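
For comparison, here is a minimal sketch of a field-wise variant of the
same loop, which stores the struct's single field rather than assigning
the whole struct. The names 'boxed_sketch' and 'fill_fieldwise' are
hypothetical and introduced only for this illustration, and N is reduced
to 2^20 to keep the run short. Whether this form avoids the slowdown is
an assumption that I have not measured.

```lua
local ffi = require("ffi")

-- Hypothetical type name, chosen so it does not clash with 'boxed' above.
ffi.cdef[[
typedef struct { double x; } boxed_sketch;
]]

-- Hypothetical helper: write the field of each element directly
-- instead of assigning the whole struct (arr[i] = c).
local function fill_fieldwise(arr, n, c)
    local x = c.x
    for i = 0, n-1 do arr[i].x = x end
end

local N = 2^20   -- smaller than the original 2^26 (assumption, for brevity)
local arr = ffi.new("boxed_sketch[?]", N)
local c = ffi.new("boxed_sketch", 10)

local t = os.clock()
fill_fieldwise(arr, N, c)
t = os.clock() - t
print('boxed_sketch (field-wise) :', t..'s', t/N*1e9 ..' ns/element')
```

The result is the same as in the 'boxed' test (every element ends up
with x = 10), so the two timings should be directly comparable.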

Thanks,
Simon
