Re: Projects I may be willing to sponsor (Large Block GC Bypass Resolved)

  • From: Joe Ellsworth <joexdobs@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Wed, 10 Dec 2014 17:41:50 -0800

I agree about the difficulty of scaling the Java GC to large heaps; I am dealing with Java GC issues in an Elasticsearch project this week.


Your approach resolves my issue and provides just what I needed. It seems like the best of both worlds: the ability to control large-block allocation as I do in optimized C, while keeping a GC handy the rest of the time. This is a huge benefit because it also lets me choose when to pay the GC overhead. You have an elegant answer to a very thorny problem.

I recommend some new pages such as:

 * "Manage Unlimited Memory Using LuaJIT Today"
 * "How to bypass GC limits when using LuaJIT for Big Data projects"
 * "How LuaJIT bypasses Java GC hell in Big Data projects"

   This is a feature the LuaJIT team really should promote and explain
   in detail. It makes the language even more attractive for
   implementing complex big data projects. I don't think many
   engineers in the Big Data community know about the capability, but
   many I meet are facing GC issues. A common solution is for them to
   port to C or to scale out in a complex distributed architecture.
   Your solution is a nice intermediate step that would be lower cost
   than porting to C.
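The pattern described above can be sketched as follows. This is my own illustration, not code from the thread: the names and the size are invented, and the block is scaled down here, though in practice it could be the tens of gigabytes under discussion.

```lua
-- Sketch: allocate a large block outside the LuaJIT GC heap via the
-- C allocator, then attach ffi.C.free as a finalizer with ffi.gc so
-- the block is still released when the cdata becomes unreachable.
local ffi = require("ffi")
ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
]]

local NBYTES = 64 * 1024 * 1024  -- 64 MiB here; scale up in practice
local p = ffi.C.malloc(NBYTES)
if p == nil then error("out of memory") end

-- The block does not count against LuaJIT's GC-managed memory limit,
-- but ffi.gc still frees it automatically on collection.
local block = ffi.gc(ffi.cast("double *", p), ffi.C.free)

block[0] = 42.0  -- use it like a plain C array of doubles
```

This keeps the allocation itself under manual control while the GC handles the cleanup, which is exactly the "pay the GC overhead only when you choose" trade-off discussed above.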

Thanks for all the help. This is by far the best set of responses I have received from any user-group in years.


On 12/10/2014 4:15 PM, Pierre-Yves Gérardy wrote:
On Thu, Dec 11, 2014 at 12:27 AM, Mike Pall <mike-1412@xxxxxxxxxx> wrote:
Joe Ellsworth wrote:
What I really want is a patch that will allow me to use 40 gig of
FFI allocations on a server that contains 80 Gig of Ram.
local x = ffi.cast("double *", ffi.C.malloc(40*2LL^30))

But, yes ... you need to manually manage the memory. There's
simply no way to get reliable, performant and automatic memory
management under these constraints.

[Ask the Java people about the horrendous things they need to do
to work around the 'best of their class' garbage collectors for
huge memory loads.]
If I understand correctly, Joe is using large objects. He may malloc
the large arrays and reference them in managed FFI structs that use
the __gc metamethod to clean things up.

Here's what I use for buffers.

     local BUFF_INIT_SIZE = 16
     local charsize = ffi.sizeof"char"
     ffi.cdef"void* malloc (size_t size);"
     ffi.cdef"void free (void* ptr);"

     local Buffer = ffi.metatype(
         --               size,       index,            array
         "struct{uint32_t s; uint32_t i; unsigned char* a;}",
         {__gc = function(self) ffi.C.free(self.a) end}
     )

     local function make_buffer()
         local b = Buffer(
             BUFF_INIT_SIZE,
             0,
             ffi.C.malloc(BUFF_INIT_SIZE * charsize)
         )
         return b
     end

     local function reserve (buf, size)
         -- ensure the buffer can accommodate `size` bytes
         if size <= buf.s then return end
         repeat buf.s = buf.s * 2 until size <= buf.s
         local a = ffi.C.malloc(buf.s * charsize)
         if a == nil then error("out of memory") end
         ffi.copy(a, buf.a, buf.i)
         ffi.C.free(buf.a)
         buf.a = a
     end
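A short, self-contained usage sketch of this growable-buffer idea: the `append` helper below is my own addition (not from the post above), using `realloc` rather than the malloc/copy/free sequence, but following the same doubling-growth scheme.

```lua
local ffi = require("ffi")
ffi.cdef[[
void *malloc(size_t size);
void *realloc(void *ptr, size_t size);
void free(void *ptr);
]]

-- A plain Lua table standing in for the Buffer struct:
-- s = capacity, i = bytes used, a = backing C array.
local buf = { s = 16, i = 0,
              a = ffi.cast("unsigned char *", ffi.C.malloc(16)) }

-- Hypothetical append helper: grow by doubling, then copy bytes in.
local function append(b, str)
    local need = b.i + #str
    if need > b.s then
        repeat b.s = b.s * 2 until need <= b.s
        local a = ffi.C.realloc(b.a, b.s)
        if a == nil then error("out of memory") end
        b.a = ffi.cast("unsigned char *", a)
    end
    ffi.copy(b.a + b.i, str, #str)
    b.i = b.i + #str
end

append(buf, "hello, ")
append(buf, "big data world")  -- forces growth past the initial 16 bytes
local out = ffi.string(buf.a, buf.i)
print(out)  -- hello, big data world
ffi.C.free(buf.a)
```

Note that this sketch frees the buffer manually; in real code you would attach the cleanup via ffi.metatype's __gc, as in the struct version above.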

—Pierre-Yves


--Mike

