Re: ffi.new vs ffi.C.malloc speed

From: Mike Pall <mike-1210@xxxxxxxxxx>
To: luajit@xxxxxxxxxxxxx
Date: Wed, 10 Oct 2012 17:04:12 +0200

Ronan Collobert wrote:
>  Still reading luajit mailing list, I understand that I can
>  instead use ffi.C.malloc. In which case I need to do three
>  calls: (1) ffi.C.malloc, (2) ffi.cast such that my vector is
>  usable and (3) ffi.gc such that it is properly garbage
>  collected. This works great, but I noticed this process is much
>  slower (~ x10) than the ffi.new way (looks like the GC takes
>  quite a lot of time in that case).

ffi.gc() is not JIT-compiled (yet). So you're seeing the
interpreter overhead for the FFI.

But even if that function were compiled (ETA: tomorrow), the GC
will have to call the finalizer eventually, which is costly.
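
For reference, the two allocation paths being compared can be sketched roughly as follows. This is a minimal sketch, not the code from benchmem.lua: the helper names new_vec_gc and new_vec_malloc are hypothetical, and the element type (float) is an assumption chosen so that 10 elements come to the 40 bytes mentioned below.

```lua
local ffi = require("ffi")

-- Declarations for the C allocator; needed before ffi.C.malloc can be called.
ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
]]

-- Path 1: ffi.new. The memory belongs to the LuaJIT GC, no finalizer involved.
-- (hypothetical helper name)
local function new_vec_gc(n)
  return ffi.new("float[?]", n)
end

-- Path 2: malloc + cast + finalizer, i.e. the three calls from the question.
-- (hypothetical helper name)
local function new_vec_malloc(n)
  local raw = ffi.C.malloc(n * ffi.sizeof("float"))
  if raw == nil then error("malloc failed") end   -- a NULL pointer compares equal to nil
  -- ffi.gc attaches ffi.C.free as the finalizer and returns the same cdata.
  return ffi.gc(ffi.cast("float *", raw), ffi.C.free)
end

local a = new_vec_gc(10)      -- 10 floats = 40 bytes (assumed element type)
local b = new_vec_malloc(10)
a[0], b[0] = 1.5, 2.5
```

The second path leaves every object with a finalizer that the GC has to run before the memory is released, which is the cost described above.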

> See benchmem.lua in attachment:

The question is whether this benchmark says anything about your
use case. I mean ... do you really need to allocate and free ten
million tiny objects (40 bytes each) as fast as possible?

In your intro you're talking about "large floating point vectors".
And I'm assuming they have a considerable lifetime, simply because
they need to be read and written. I doubt the alloc/free speed
matters for this use case.

> Is there any better solution?

Recycle the vectors? It certainly pays off if they are large.
But beware of premature optimization.
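
One possible way to recycle vectors is a simple free list keyed by length. This is only a sketch under my own assumptions: the names pool, acquire and release are hypothetical, the element type is assumed to be float, and whether it helps at all depends on the real workload.

```lua
local ffi = require("ffi")

-- Hypothetical free list: pool[n] holds unused vectors of length n.
local pool = {}

-- Hand out a recycled vector of length n if one is available,
-- otherwise allocate a fresh one with ffi.new.
local function acquire(n)
  local list = pool[n]
  if list and #list > 0 then
    return table.remove(list)   -- note: contents are stale, not zeroed
  end
  return ffi.new("float[?]", n)
end

-- Give a vector back for later reuse instead of dropping it.
-- The caller passes the length, since the pool is keyed by it.
local function release(v, n)
  local list = pool[n]
  if not list then
    list = {}
    pool[n] = list
  end
  list[#list + 1] = v
end

local v = acquire(1000)
v[0] = 3
release(v, 1000)
local w = acquire(1000)   -- same object as v, nothing new was allocated
```

A recycled vector still holds its old contents, whereas ffi.new returns zero-filled memory, so callers that depend on zeros would need to clear it themselves (e.g. with ffi.fill).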

> What should I expect in terms of speed improvement in the future
> (say, with the new GC in luajit 2.1)?

Finalizers will become cheaper, but their overhead is still
substantial.

--Mike

Follow-Ups:
- Re: ffi.new vs ffi.C.malloc speed
  - From: Ronan Collobert
- Re: ffi.new vs ffi.C.malloc speed
  - From: Mike Pall

References:
- ffi.new vs ffi.C.malloc speed
  - From: Ronan Collobert
