Re: Allocation sinking limits

  • From: Mike Pall <mikelj-1508@xxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Sat, 1 Aug 2015 18:26:45 +0200

Vyacheslav Egorov wrote:

I have updated the gist: https://gist.github.com/mraleph/7dcfbef44a02ec72712e

[though the hackiness of the patch remains on the same level, e.g. it
takes aliasing between keys but not tables into accounts - so it might
still get confused by some unrelated store]

You can do better:

First, this is similar in style to lj_opt_fwd_href_nokey(). Check
where this is used and find the rule LJFOLD(HREF TNEW any) in
lj_opt_fold.c. That's where you'd want to anchor such an
optimization.

But ... think hard about cost vs. benefit. Or the common case vs.
the rare complicated case (which probably doesn't pay off to
optimize).

All of these loops you see in lj_opt_mem.c use very few iterations
in practice. That's because the exclusion criteria and the search
is tuned to reach a conclusion as early as possible. This is
similar to CSE, which is commonly considered to be expensive. But
in LuaJIT it's dirt cheap since it iterates only between 0 and 1
times on average. That's why the compiler can afford to shove
almost the complete IR through CSE at the end of the pipeline.

But way more important than compile time cost is the meta
question: does the optimization you're trying to add have the
right TCO vs. benefit ratio?

The cost depends not just on the compile time overhead, but also
the maintenance cost of adding a new feature. The benefit depends
on how common the code pattern is that you're trying to optimize
and how much runtime cost the optimization saves.

Ok, so let's do a quick analysis for this case:

I'd say the code pattern is quite rare. The pattern can be caught
by adding two entries into the FOLD matching table. Since that's a
semi-perfect hash, the additional cost is zero. The code pattern
is rare, so the cost of the aliasing search doesn't really matter.
Also, if the FOLD matches, there's a high probability the aliasing
search is successful. Then the runtime savings will likely recover
the extra compile time costs.

Of course, the code complexity goes up, but it's copy'n'paste from
a similar optimization. You can likely get it right the first time
and writing the tests is straightforward. The maintenance cost of
adding that feature is probably low.

Next, on to the benefits: the code pattern is rare, but the
savings are good -- it avoids an allocation. Uh oh .. that's the
troublesome case for any analysis, because the overall benefit is
a bit unclear.

Well ... you decide!

Never forget that LuaJIT has quite advanced optimizations, but
it's still a JIT compiler. And the small code footprint is a
selling point, too. Since the low-hanging fruit have all been
picked, every new optimization has to be carefully evaluated.


Anyway, that's probably not the answer you expected. :-) Instead
of either reworking or dropping your patch, I thought it would be
helpful to go on a tangent and explain the reasoning I'd follow,
but leave the ultimate decision in your hands.

--Mike

Other related posts: