Re: Array performance with 2.0.0-beta10 versus git HEAD

  • From: Mike Pall <mike-1208@xxxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Tue, 28 Aug 2012 23:25:03 +0200

Peter Colberg wrote:
> The reason for not unrolling manually is to have dimension-independent
> code, suitable for both two- and three-dimensional systems of particles.

To achieve consistently high performance you'll need to unroll
those small vector operations by hand. See the previous posts
about tuning GSL on the Lua mailing list (it uses a template
pre-processor to automate that).
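
One way to keep the source dimension-independent while still running unrolled code is to generate the specialised function once per dimension. The following is only a minimal sketch of that idea, not the template pre-processor mentioned above; `make_vector_add` is a hypothetical helper name, and the element-wise addition stands in for whatever small vector operation is in the hot loop.

```lua
-- Hypothetical helper: builds an unrolled "r = a + b" function for a
-- fixed dimension by generating Lua source and compiling it once.
local load_chunk = loadstring or load  -- Lua 5.1/LuaJIT vs. Lua 5.2+

local function make_vector_add(dim)
  local lines = { "return function(r, a, b)" }
  for i = 1, dim do
    -- Emit one statement per component instead of a runtime loop.
    lines[#lines + 1] = string.format("  r[%d] = a[%d] + b[%d]", i, i, i)
  end
  lines[#lines + 1] = "  return r"
  lines[#lines + 1] = "end"
  local chunk = assert(load_chunk(table.concat(lines, "\n")))
  return chunk()
end

-- Create the specialisations once, outside any hot loop.
local add2 = make_vector_add(2)
local add3 = make_vector_add(3)
```

The calling code then selects `add2` or `add3` at setup time, so the inner loops contain only straight-line component operations.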

> I was actually surprised to see that the bound checks have no
> influence on the performance (maybe the above “benchmark” is too
> trivial and flawed...). Is ABCelim in lj_opt_fold.c responsible for
> eliminating such bound checks?

Bounds checks use the integer units of the CPU, whereas the actual
computations use the floating-point units. With a superscalar
out-of-order CPU, the integer overhead is completely hidden.
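
A rough way to observe this is to time the same floating-point loop with and without an explicit integer bounds check. This is only a sketch under assumptions: the function names, array size and repeat count are made up for illustration, it uses a plain Lua table rather than the arrays from the original benchmark, and a JIT compiler may hoist or remove the check itself, so the timings do not isolate the hardware effect.

```lua
local N = 1000
local x = {}
for i = 1, N do x[i] = i * 0.5 end

-- Floating-point work only.
local function sum_unchecked(v, n)
  local s = 0
  for i = 1, n do s = s + v[i] * v[i] end
  return s
end

-- Same work, plus an explicit integer bounds check per access.
local function sum_checked(v, n, size)
  local s = 0
  for i = 1, n do
    if i < 1 or i > size then error("index out of bounds") end
    s = s + v[i] * v[i]
  end
  return s
end

-- Repeat the call to get a measurable CPU time; returns time and last result.
local function time_it(f, ...)
  local t0 = os.clock()
  local result
  for _ = 1, 1000 do result = f(...) end
  return os.clock() - t0, result
end

local t_unchecked, s1 = time_it(sum_unchecked, x, N)
local t_checked, s2 = time_it(sum_checked, x, N, N)
print(string.format("unchecked: %.4f s, checked: %.4f s", t_unchecked, t_checked))
```

Both variants compute the same sum; only the printed timings differ, and those depend on the machine and the Lua implementation.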

--Mike
