ffi.new vs ffi.C.malloc speed

From: Ronan Collobert <ronan@xxxxxxxxxxxxx>
To: luajit@xxxxxxxxxxxxx
Date: Wed, 10 Oct 2012 16:43:03 +0200

Hi,

Assuming I want to deal with large floating point vectors, it looks to me like 
I cannot use ffi.new. Reading the luajit mailing list, I understand that 
ffi.new arrays are constrained by the 2GB luajit memory management limit. So, 
a simple...

> = ffi.new('float[?]', 1000000000)
bad argument #1 to '?' (size of C type is unknown or too large)

...which tries to allocate slightly less than 4GB (10^9 floats at 4 bytes 
each), fails.

Still reading the luajit mailing list, I understand that I can instead use 
ffi.C.malloc. In that case I need to make three calls: (1) ffi.C.malloc to 
allocate the memory, (2) ffi.cast so that my vector is usable, and (3) ffi.gc 
so that it is properly garbage collected. This works great, but I noticed this 
process is much slower (about 10x) than the ffi.new way (it looks like the GC 
takes quite a lot of time in that case). See benchmem.lua in attachment:

time luajit benchmem.lua vla
real    0m1.505s
user    0m1.500s
sys     0m0.000s

time luajit benchmem.lua malloc
real    0m13.928s
user    0m13.140s
sys     0m0.770s
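
The three-call pattern described above can be sketched as follows. This is a 
minimal illustration under stated assumptions, not the attached benchmem.lua: 
the helper name new_float_vector is hypothetical, and it assumes the C 
library's malloc and free are reachable through ffi.C.

```lua
local ffi = require("ffi")

-- Declare the C functions we call through ffi.C.
ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
]]

-- Hypothetical helper: allocate n floats outside the LuaJIT-managed heap.
local function new_float_vector(n)
  -- (1) allocate raw memory with the C allocator
  local raw = ffi.C.malloc(n * ffi.sizeof("float"))
  if raw == nil then  -- a NULL pointer compares equal to nil
    error("malloc failed")
  end
  -- (2) cast the void* so the memory can be indexed as floats
  local vec = ffi.cast("float *", raw)
  -- (3) attach a finalizer so free() runs when vec is collected
  return ffi.gc(vec, ffi.C.free)
end

local v = new_float_vector(1000)
v[0] = 1.5      -- indices are zero-based, as in C
v[999] = 2.5
```

Note that the memory returned by malloc is uninitialized, whereas ffi.new 
zero-fills by default, so the two approaches are not strictly equivalent.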

Is there any better solution? What should I expect in terms of speed 
improvement in the future (say, with the new GC in luajit 2.1)?

Thanks,
Ronan.

Follow-Ups:
- Re: ffi.new vs ffi.C.malloc speed
  - From: Mike Pall
