Re: Using FFI for large string buffers and a suggestion

  • From: Mike Pall <mike-1206@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Wed, 27 Jun 2012 13:39:19 +0200

QuentinC wrote:
> FFI is the winner, is that intended or is it just accidental ?

The naive approach is well known to have quadratic complexity, due
to string interning.

Although the FFI solution and table.concat use a similar approach,
the FFI solution is faster here. table.concat does way too many
memory copies and temporary string interning for this test case.

table.concat uses the luaL_Buffer API, which is known to be rather
inefficient for certain use cases. That's one piece of code taken
unmodified from the Lua interpreter. It needs to be replaced with
something better. That's already on my TODO list, but I'm not sure
when I'll get around to it (probably in 2.1).

> Why not adding a function ffi.renew to enlarge an already existing
> VLA/VLS using realloc internally ?

That doesn't work, because the buffer is part of the object.
Enlarging the buffer may require changing the address and a buffer
copy. But objects cannot change their address.

You can create and grow a buffer with realloc and use that instead.
You need to free it when finished, of course (or use ffi.gc).

> What about returning the number of bytes copied by ffi.copy when no
> explicit length is given ?

Well, that's just as arbitrary as returning the destination buffer
address from memcpy. Or the end of the destination buffer etc. I'd
rather leave it up to the user what auxiliary information to keep,
instead of trying to optimize unused return values away.

> for i=1,n do
> l = tostring(i)
> if p+#l+1 > c then -- need buffer space
>   c = math.ceil(c * 3/2 +1)
>   local ns = ffi.new('char[?]', c)
>   ffi.copy(ns, s, p)
>   s = ns
> end
> ffi.copy(s+p, l)
> p = p + #l
> end

The buffer growth logic is wrong. It doesn't account for the case
where l is bigger than c/2. And doubling the buffer is simpler, too:

  if p+(#l+1) > c then
    c = math.max(c+c, p+(#l+1))
  [...]

--Mike

Other related posts: