ffi.cast to "type *" for C callbacks causes crash if used many times

  • From: "Adrian Smith" <triode1@xxxxxxxxxxxxxx>
  • To: <luajit@xxxxxxxxxxxxx>
  • Date: Sat, 11 Aug 2012 17:36:46 +0100

Hi,

I have found that using ffi.cast("<type> *", func) rather than ffi.cast("<type>", func) for C callbacks can cause segfaults after many calls. Even if the resulting type is always the same, it appears to trigger calls to ctype_new() and the allocation of additional type IDs within LuaJIT, which eventually causes a segfault after a C callback is processed.

I'm using ffi.cast so that I can free the C callback, as my code creates multiple new callbacks when existing callbacks fire.

The following is a minimal test case.

#!./luajit
-- load ffi
local ffi = require("ffi")
local C = ffi.C

-- load test library
ffi.cdef([[
typedef void (func)(void);
void register_cb(func *cb);
void processevent(void);
]])

local callbacks = ffi.load("./cbtest.so")

local cb
local lua_cb
lua_cb =
function()
  print("in callback")
  -- do something useful
  cb:free()
  cb = ffi.cast("func *", lua_cb)
  callbacks.register_cb(cb)
end

-- start
cb = ffi.cast("func *", lua_cb)
callbacks.register_cb(cb)

while true do
  callbacks.processevent()
end

Gives the following after running for a while:

Program received signal SIGSEGV, Segmentation fault.
0x080b7d3a in ctype_child (ct=0xb7715658, cts=0x8427050) at lj_ctype.h:415
415       lua_assert(!(ctype_isvoid(ct->info) || ctype_isstruct(ct->info) ||
(gdb) bt
#0 0x080b7d3a in ctype_child (ct=0xb7715658, cts=0x8427050) at lj_ctype.h:415
#1  ctype_rawchild (ct=0xb7715658, cts=0x8427050) at lj_ctype.h:431
#2 ccall_get_results (L=0x8420008, cts=0x8427050, ct=0xb7715658, cc=0xbff9c1b0, ret=0xbff9c270) at lj_ccall.c:763
#3 0x080b84a6 in lj_ccall_func (L=0x8420008, cd=0x8429fa0) at lj_ccall.c:820
#4  0x0806ecb5 in lj_cf_ffi_meta___call (L=0x8420008) at lib_ffi.c:228
#5  0x0807670f in lj_BC_FUNCC ()
#6 0x08059163 in lua_pcall (L=0x8420008, nargs=0, nresults=-1, errfunc=2) at lj_api.c:1034
#7  0x0804ad56 in docall (L=0x8420008, narg=0, clear=0) at luajit.c:134
#8 0x0804b6d9 in handle_script (L=0x8420008, argv=0xbff9c5b4, n=1) at luajit.c:301
#9  0x0804c288 in pmain (L=0x8420008) at luajit.c:550
#10 0x0807670f in lj_BC_FUNCC ()
#11 0x0805933f in lua_cpcall (L=0x8420008, func=0x804c0e6 <pmain>, ud=0xbff9c4fc) at lj_api.c:1056
#12 0x0804c377 in main (argc=2, argv=0xbff9c5b4) at luajit.c:579


However the following change makes it work fine:

--- test2.lua   2012-08-11 16:44:46.000000000 +0100
+++ test1.lua   2012-08-11 16:48:34.000000000 +0100
@@ -5,8 +5,8 @@

-- load test library
ffi.cdef([[
-typedef void (func)(void);
-void register_cb(func *cb);
+typedef void (*func)(void);
+void register_cb(func cb);
void processevent(void);
]])

@@ -19,12 +19,12 @@
               print("in callback")
               -- do something useful
               cb:free()
-               cb = ffi.cast("func *", lua_cb)
+               cb = ffi.cast("func", lua_cb)
               callbacks.register_cb(cb)
       end

-- start
-cb = ffi.cast("func *", lua_cb)
+cb = ffi.cast("func", lua_cb)
callbacks.register_cb(cb)

while true do

The following is my trivial C callback generator used in the code above to reproduce this. I don't claim that it does anything other than minimally emulate the library I'm using, just enough to reproduce the problem. The actual library is Spotify's libspotify, which defines its callback typedefs as function types rather than pointers to functions, so it's natural to cast to "<type> *" using the typedefs defined by the library (as in the failing example). I would suggest that ideally this form should be supported.

#include <stdio.h>

#define SIZE 4096

typedef void (*func)(void);

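/* Ring buffer of pending callbacks: a is the write index, b the read index. */
static func callbacks[SIZE];
static int a = 0;
static int b = 0;

/* Queue a callback; report an overflow if the ring buffer is full. */
void register_cb(func cb) {
    int next = (a + 1) % SIZE;
    if (next != b) {
        callbacks[a] = cb;
        a = next;
    } else {
        printf("cbtest: callback overflow\n");
    }
}

/* Invoke the oldest pending callback, if there is one. */
void processevent(void) {
    if (a != b) {
        (callbacks[b])();
        b = (b + 1) % SIZE;
    }
}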
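
For reference, here is a small sketch of why the two typedef styles should be interchangeable on the C side. In C, a pointer to a function-type typedef and a pointer-to-function typedef name the same type, so a callback stored through one can be called through the other. The names func_type, func_ptr, on_event, call_via_type, call_via_ptr and demo_typedef_forms are made up for this illustration and do not come from libspotify or from the test library above.

```c
/* Function-type typedef: the style the failing example declares. */
typedef void (func_type)(void);
/* Pointer-to-function typedef: the style the working example declares. */
typedef void (*func_ptr)(void);

/* Counts how many times the hypothetical callback has run. */
static int calls = 0;

static void on_event(void) { calls++; }

/* Both parameters have the same C type: pointer to void(void). */
static void call_via_type(func_type *cb) { cb(); }
static void call_via_ptr(func_ptr cb) { cb(); }

/* Calls on_event through each form and returns the number of calls made. */
int demo_typedef_forms(void) {
    func_type *p1 = on_event; /* function name decays to a pointer */
    func_ptr p2 = on_event;

    calls = 0;
    call_via_type(p1);
    call_via_ptr(p2);
    call_via_ptr(p1); /* no cast needed: the two types are identical */
    return calls;
}
```

Calling demo_typedef_forms() from a main() runs the callback once per call site, which is the behaviour I would expect "func *" and "func" to share when passed to ffi.cast.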

The architecture is Linux i386.

I'm not sure if cb:free() should be called within the callback itself, but it seems to work - is this advised?

Adrian
