Re: [PATCH] Tunable LJ_MIN_STRTAB

  • From: Mike Pall <mike-1403@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Tue, 18 Mar 2014 00:49:15 +0100

Yichun Zhang (agentzh) wrote:
> For our web apps creating a lot of short-lived Lua strings, we've
> noticed from our C-land on-CPU Flame Graphs that the lj_str_new is
> taking a lot of time. Further investigation shows that it is because
> the bucket collision rate in the global string hash table is quite
> high due to the relatively small bucket count.
> 
> Increasing LJ_MIN_STRTAB to a larger value, say, 4096, helps reducing
> the overhead of lj_str_new() a lot according to the flame graph.

Umm, you realize the string table automatically resizes? So ... do
you want to say that the resizing doesn't work? Or that the chosen
load factor leads to too many collisions for your use case?

In the latter case, tuning the load factor would make more sense.
Maybe even in general, not just for your application. The load
factor of 100% (*) is hard-coded near the end of lj_str_new().

Maybe you can experiment with this a bit and tell us your findings?
But be careful to use only shifts, adds or multiplies with small
numbers in the formula. Or maybe precompute the next resize step.
Note that if you want to go below 50%, then you'll need to change
gc_onestep() near GCSsweep, too.

(*) Yes, this would be way too high for most hash table
implementations. But it's ok for Brent's variation of chained
hashing. It does work fine for regular Lua tables. OTOH, maybe
it's not an optimal choice for the global string hash table.

--Mike

Other related posts:

  • » Re: [PATCH] Tunable LJ_MIN_STRTAB - Mike Pall