Re: Luajit string concatenation performance

From: Mike Pall <mike-1305@xxxxxxxxxx>
To: luajit@xxxxxxxxxxxxx
Date: Thu, 23 May 2013 00:42:30 +0200

Szabó Antal wrote:
> I don't really know why is the difference, but it's quite significant.
> Because LuaJIT's interpreter is mostly written in assembly (if I know
> correctly),

The interpreter doesn't matter here. All of the time is spent in
C library code.

> I can only think it's the standard library that's faster
> in MSVC compared to GCC (maybe the main bottleneck is memcpy() based
> on your fix, but then it should affect plain Lua too).

It's a mix of memcpy() and malloc()/realloc() performance. Both
depend on libc implementations. The allocator may use different
strategies and employ different kernel calls. Alignment and order
of memory blocks make a difference, too.

On Linux/x64, Lua 5.2 is much slower than Lua 5.1 here, although
the relevant C code in both VMs is almost the same. The difference
is entirely due to the order of memory blocks. That causes glibc's
memory allocator to employ different resizing strategies.

Anyway, the test has an atypical profile, so that doesn't really
help with tuning anything. As has been pointed out, the solution
is to use appropriate data structures that avoid the quadratic
behavior in the first place. This particular pitfall is well
known, too:
  http://www.luafaq.org/#T1.9
  http://www.lua.org/pil/11.6.html
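
For illustration, here is a minimal sketch of the table-based approach
those links describe, next to the naive loop. The function names
build_naive and build_buffered are hypothetical and not taken from the
original test; only the `..` operator and table.concat() are standard Lua.

```lua
-- Naive approach: every `..` creates a brand-new string, so the total
-- number of bytes copied grows quadratically with the number of pieces.
local function build_naive(n, piece)
  local s = ""
  for i = 1, n do
    s = s .. piece
  end
  return s
end

-- Buffer approach: collect the pieces in a table and join them once
-- at the end with table.concat().
local function build_buffered(n, piece)
  local parts = {}
  for i = 1, n do
    parts[i] = piece
  end
  return table.concat(parts)
end

-- Both functions produce the same string; only the cost differs.
local a = build_naive(1000, "abc")
local b = build_buffered(1000, "abc")
assert(a == b)
assert(#a == 3000)
```

With the table-based version the intermediate strings are never
created, so the result should depend far less on how a given libc
handles repeated reallocation and copying.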

--Mike

