Segfault from nil table access when using ffi.gc

  • From: Chris Osborne <chris.osborne@xxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Fri, 11 Nov 2016 21:35:00 +0000

Hi,

I seem to have hit a bug in LuaJIT, tested with version 2.0.4 on both Linux x64 (Mint 17.3, default apt-get package) and OS X 10.10 (Homebrew release). If I have a live pointer that is subject to ffi.gc with my own finalizer (rather than just malloc/free), I get a segfault when I index a nil value, instead of a nice traceback telling me off. I've included a minimal working example below:

test.cpp:

#include <cstdlib>

extern "C"
{
typedef struct
{
    int len;
    double* a;
} Abc;
}


extern "C"
Abc* Abc_new(int length)
{
    Abc* a = (Abc*)malloc(sizeof(Abc));
    a->a = (double*)malloc(length * sizeof(double));
    a->len = length;
    return a;
}

extern "C"
void Abc_destruct(Abc* a)
{
    free(a->a);
    free(a);
}
--------------------------------------------------------------------------------------
compile with `c++ -fpic -shared -o libtest.so test.cpp`
--------------------------------------------------------------------------------------
seg.lua:

local blah = nil
local ffi = require('ffi')
ffi.cdef[[
typedef struct
{
    int len;
    double* a;
} Abc;

Abc* Abc_new(int length);
void Abc_destruct(Abc* a);
]]

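local test = ffi.load('./libtest.so')
-- Attach Abc_destruct as the finalizer for the returned pointer.
local cells = ffi.gc(test.Abc_new(100), test.Abc_destruct)
-- blah is nil, so this should raise an "attempt to index ... (a nil value)" error.
blah:this_will_segfault()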
-------------------------------------------------------------------------------------

If I call blah:this_will_segfault() before the ffi.gc call, I get the traceback I'd expect, but calling it after the ffi.gc call (as above) gives a segfault. Valgrind seems to link this segfault to lua_gc:

==22498== Memcheck, a memory error detector
==22498== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==22498== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==22498== Command: luajit seg.lua
==22498==
==22498== Jump to the invalid address stated on the next line
==22498==    at 0x5D26754: ???
==22498==    by 0x43BE45: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x43C51F: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x454585: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x40E779: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x42B877: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x42B916: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x448E37: lua_gc (in /usr/bin/luajit-2.0.4)
==22498==    by 0x403EE8: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x404CAD: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x454585: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x448B79: lua_cpcall (in /usr/bin/luajit-2.0.4)
==22498==  Address 0x5d26754 is not stack'd, malloc'd or (recently) free'd
==22498==
==22498==
==22498== Process terminating with default action of signal 11 (SIGSEGV)
==22498==  Access not within mapped region at address 0x5D26754
==22498==    at 0x5D26754: ???
==22498==    by 0x43BE45: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x43C51F: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x454585: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x40E779: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x42B877: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x42B916: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x448E37: lua_gc (in /usr/bin/luajit-2.0.4)
==22498==    by 0x403EE8: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x404CAD: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x454585: ??? (in /usr/bin/luajit-2.0.4)
==22498==    by 0x448B79: lua_cpcall (in /usr/bin/luajit-2.0.4)
==22498==  If you believe this happened as a result of a stack
==22498==  overflow in your program's main thread (unlikely but
==22498==  possible), you can try to increase the size of the
==22498==  main thread stack using the --main-stacksize= flag.
==22498==  The main thread stack size used in this run was 8388608.
==22498==
==22498== HEAP SUMMARY:
==22498==     in use at exit: 816 bytes in 2 blocks
==22498==   total heap usage: 9 allocs, 7 frees, 6,927 bytes allocated
==22498==
==22498== LEAK SUMMARY:
==22498==    definitely lost: 0 bytes in 0 blocks
==22498==    indirectly lost: 0 bytes in 0 blocks
==22498==      possibly lost: 0 bytes in 0 blocks
==22498==    still reachable: 816 bytes in 2 blocks
==22498==         suppressed: 0 bytes in 0 blocks
==22498== Rerun with --leak-check=full to see details of leaked memory
==22498==
==22498== For counts of detected and suppressed errors, rerun with: -v
==22498== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault
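
As a sanity check on the C side, the two functions can be exercised directly, without LuaJIT in the picture. The following is only a minimal sketch, assuming the same Abc layout as test.cpp; the null checks and the abc_roundtrip helper are additions for illustration and are not part of the library above.

```cpp
#include <cstddef>
#include <cstdlib>

extern "C" {
typedef struct {
    int len;
    double* a;
} Abc;

Abc* Abc_new(int length);
void Abc_destruct(Abc* a);
}

// Same allocation logic as test.cpp, with failure checks added (assumption:
// returning a null pointer on failure is acceptable to callers).
extern "C" Abc* Abc_new(int length)
{
    if (length < 0) return nullptr;
    Abc* a = static_cast<Abc*>(std::malloc(sizeof(Abc)));
    if (a == nullptr) return nullptr;
    a->a = static_cast<double*>(
        std::malloc(static_cast<std::size_t>(length) * sizeof(double)));
    if (a->a == nullptr && length > 0) {
        std::free(a);
        return nullptr;
    }
    a->len = length;
    return a;
}

// Frees the buffer and then the struct; a null pointer is ignored.
extern "C" void Abc_destruct(Abc* a)
{
    if (a == nullptr) return;
    std::free(a->a);
    std::free(a);
}

// Hypothetical helper: allocate, write every element, check, and free.
bool abc_roundtrip(int length)
{
    Abc* abc = Abc_new(length);
    if (abc == nullptr) return false;
    for (int i = 0; i < abc->len; ++i) {
        abc->a[i] = static_cast<double>(i);
    }
    bool ok = (abc->len == length) &&
              (length == 0 || abc->a[length - 1] == static_cast<double>(length - 1));
    Abc_destruct(abc);
    return ok;
}
```

Calling abc_roundtrip(100) from a small main() and running that binary under Valgrind should show whether the allocation and the finalizer are clean on their own. If they are, that would point at the interaction with LuaJIT rather than at the library.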


Any advice as to whether I'm doing something wrong, or if this is a bug in luajit (known or otherwise) would be welcome -- it makes silly typo errors take a lot longer to track down!

Many thanks!
