lua traceback during memory allocations

  • From: Greg Greenway <greg@xxxxxxxxxxxxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Fri, 14 Mar 2014 12:57:31 -0700

Hi,

I'm trying to add tracking of Lua allocations to help figure out where
I'm "leaking" memory in my code.  (It's not a leak in the strict sense,
because something is still referencing the memory.)

My basic approach is to add a function call (in C code) right after the
allocation of strings, tables, functions, etc.  That call captures a Lua
backtrace and stores it in a hash table, keyed by the allocated pointer.
When the memory is freed, I remove the entry.  Periodically, I scan the
table and look for a growing number of entries with the same callstack.
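
A minimal sketch of that bookkeeping, for illustration only.  This is not
the actual memtrack.cpp code: the names (track_alloc, track_free,
track_count), the bucket count, and the fixed-size trace buffer are all
assumptions.  The table uses plain malloc() so that recording an entry
never goes back through the Lua allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TRACK_BUCKETS 1024  /* hypothetical size; must be a power of two */
#define TRACE_MAX     256   /* hypothetical cap on the stored traceback */

typedef struct TrackEntry {
    void *ptr;                 /* the tracked allocation */
    char trace[TRACE_MAX];     /* "file:line;file:line;..." text */
    struct TrackEntry *next;   /* chaining for bucket collisions */
} TrackEntry;

static TrackEntry *track_table[TRACK_BUCKETS];

static size_t track_hash(const void *ptr)
{
    uintptr_t v = (uintptr_t)ptr;
    /* Drop the low alignment bits and mix in some higher ones. */
    return (size_t)((v >> 4) ^ (v >> 14)) & (TRACK_BUCKETS - 1);
}

/* Record ptr with its traceback text.  Returns 0 on success, -1 on OOM. */
static int track_alloc(void *ptr, const char *trace)
{
    assert(ptr != NULL && trace != NULL);
    TrackEntry *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->ptr = ptr;
    snprintf(e->trace, sizeof(e->trace), "%s", trace);  /* truncates if long */
    size_t h = track_hash(ptr);
    e->next = track_table[h];
    track_table[h] = e;
    return 0;
}

/* Forget ptr.  Returns 1 if it was tracked, 0 otherwise. */
static int track_free(void *ptr)
{
    TrackEntry **pp = &track_table[track_hash(ptr)];
    for (; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->ptr == ptr) {
            TrackEntry *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}

/* Count live allocations that share the given traceback text. */
static size_t track_count(const char *trace)
{
    size_t n = 0;
    for (size_t i = 0; i < TRACK_BUCKETS; i++)
        for (TrackEntry *e = track_table[i]; e != NULL; e = e->next)
            if (strcmp(e->trace, trace) == 0)
                n++;
    return n;
}
```

The periodic check then amounts to calling track_count() for each distinct
traceback and comparing the result with the previous pass.  The linear scan
is fine for a sketch; a second table keyed by traceback would be cheaper.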

My problem is that sometimes while trying to generate the backtrace, I get
a SEGV.  My code for getting a traceback is:

    for (int i = 0; i < 6; i++) {
        int size;
        cTValue *frame = lj_debug_frame(L, i, &size);
        if (frame && frame_islua(frame)) {
            GCfunc *fn = frame_func(frame);
            if (isluafunc(fn)) {
                cTValue *nextframe = frame + size;
                BCLine line = debug_frameline(L, fn, nextframe);
                lj_debug_shortname(buf, proto_chunkname(funcproto(fn)));
                snprintf(fileline, sizeof(fileline), "%s:%d%s", buf, line,
                         sep);
            } else {
                snprintf(fileline, sizeof(fileline), "unknown%s", sep);
            }
            /* add fileline to the backtrace here */
        } else {
            break;
        }
    }

My crash backtrace is:
#0  0x00000000004bdb4f in debug_framepc (L=<optimized out>, fn=0x400dbbf8,
    nextframe=<optimized out>) at src/vendor/luajit/jit/debug.c:91
#1  0x00000000004bda29 in debug_frameline (L=0x41994378, fn=0x400dbbf8,
    nextframe=0x0) at src/vendor/luajit/jit/debug.c:125
#2  0x00000000004dd50a in backtrace (L=<optimized out>, this=0x299ef98,
    __str=<unknown type in /home/ubuntu/agent/agent, CU 0x26eae2, DIE
    0x283cb3>, L=<optimized out>)
    at src/vendor/luajit/lib/memtrack.cpp:37
#3  lua_track_mem (L=0x41994378, ptr=0x412fe270)
    at src/vendor/luajit/lib/memtrack.cpp:93
#4  0x00000000004cb5fa in newtab (L=0x41994378, asize=<optimized out>,
    hbits=0) at src/vendor/luajit/jit/tab.c:129
#5  0x00000000004cb701 in lj_tab_dup (L=0x41994378, kt=0x41735630)
    at src/vendor/luajit/jit/tab.c:169
#6  0x0000000023a0c2b4 in ?? ()
#7  0x0000000040a14e68 in ?? ()
#8  0x0000000000000000 in ?? ()

My guess is that I'm trying to walk up the callstack at a time when it
isn't in a consistent enough state.

Other variations I've tried:
1) I tried using luaL_traceback, pushing the traceback either onto the
same L or onto a newly allocated state (L).  In both cases I intermittently
ended up corrupting the stack, so that some code would fail at things like
a table lookup in a function (when the Lua code clearly had a table).  I
moved to the approach above because it doesn't change the Lua stack at all.

2) I originally hooked into the allocator with lua_setallocf(), but ended
up crashing much more often because I was trying to generate a traceback
during things like growing the Lua stack, so I tried to be more selective
about what I track.

3) I wrapped setjmp/longjmp around the code and set up a signal handler for
SIGSEGV and SIGBUS.  This seems to work, but it just hides the problem.
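
For reference, variation (3) looks roughly like the sketch below.  It is an
illustration, not my exact code: it uses the POSIX sigsetjmp/siglongjmp
variants so the signal mask is restored after the jump, and the names
(guard_install, guarded_call, ok_step, faulty_step) are made up.  The
faulty_step() helper just raises SIGSEGV to stand in for the frame walk
that crashes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf guard_env;
static volatile sig_atomic_t guard_active = 0;
static volatile sig_atomic_t guard_signal = 0;

static void guard_handler(int sig)
{
    if (guard_active) {
        guard_signal = sig;
        siglongjmp(guard_env, 1);   /* abandon the guarded call */
    }
    /* Fault outside a guarded call: fall back to the default action. */
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Install the handler for SIGSEGV and SIGBUS.  Returns 0 on success. */
static int guard_install(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = guard_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Run fn(arg).  Returns 0 if it completed, or the signal number if it
 * faulted.  Not reentrant: there is a single global jump buffer. */
static int guarded_call(void (*fn)(void *), void *arg)
{
    assert(fn != NULL);
    if (sigsetjmp(guard_env, 1) == 0) {   /* 1 = save the signal mask */
        guard_active = 1;
        fn(arg);
        guard_active = 0;
        return 0;
    }
    guard_active = 0;
    return (int)guard_signal;
}

/* Stand-ins for the traceback walk, for demonstration only. */
static void ok_step(void *arg)     { *(int *)arg = 1; }
static void faulty_step(void *arg) { (void)arg; raise(SIGSEGV); }
```

After guard_install(), guarded_call(faulty_step, NULL) returns SIGSEGV
instead of killing the process.  Whatever the interrupted code was doing is
simply abandoned, which is why this only hides the problem.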

My questions:

1) Is there any better way to track allocations than what I'm doing?  Is
there a completely different approach that would work better?

2) If there's no better option from (1), is there a better way to generate
my tracebacks that won't crash?

Thanks in advance,
greg
