Re: Array performance with 2.0.0-beta10 versus git HEAD

From: Mike Pall <mike-1208@xxxxxxxxxx>
To: luajit@xxxxxxxxxxxxx
Date: Tue, 28 Aug 2012 23:25:03 +0200

Peter Colberg wrote:
> The reason for not unrolling manually is to have dimension-independent
> code, suitable for both two- and three-dimensional systems of particles.

To achieve consistently high performance, you'll need to unroll
those small vector operations by hand. See the previous posts on
the Lua mailing list about tuning GSL (it uses a template
pre-processor to automate the unrolling).
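
A minimal sketch of how unrolling can be automated while keeping the
source dimension-independent. This is not the pre-processor mentioned
above; it is a simple stand-in that generates Lua source for a fixed
dimension and compiles it once. The helper name make_add and the
table-based vector layout are assumptions made for illustration.

```lua
-- Hypothetical helper: builds an unrolled vector-add function for a
-- fixed dimension by generating Lua source text and compiling it once.
local loadstring = loadstring or load  -- Lua 5.1/LuaJIT vs. Lua 5.2+

local function make_add(dim)
  local lines = {}
  for i = 1, dim do
    -- One straight-line statement per component; no inner loop remains
    -- in the generated function.
    lines[i] = ("  r[%d] = a[%d] + b[%d]"):format(i, i, i)
  end
  local src = "return function(r, a, b)\n"
    .. table.concat(lines, "\n")
    .. "\n  return r\nend"
  return assert(loadstring(src))()
end

-- The same generator serves both the 2D and the 3D case.
local add2 = make_add(2)
local add3 = make_add(3)

local r = add3({}, {1, 2, 3}, {10, 20, 30})
print(r[1], r[2], r[3])
```

The generator runs once per dimension at startup, so the hot path only
ever sees the straight-line function it produced.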

> I was actually surprised to see that the bound checks have no
> influence on the performance (maybe the above “benchmark” is too
> trivial and flawed...). Is ABCelim in lj_opt_fold.c responsible for
> eliminating such bound checks?

Bounds checks use the integer units of the CPU, whereas the actual
computations use the floating-point units. With a super-scalar
out-of-order CPU, the integer overhead is completely hidden.
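
One way to check this on a given machine is a small timing harness
like the sketch below. It is an assumed setup, not the benchmark from
this thread: the function names, array size and repeat count are
arbitrary, and the timings only mean something under a JIT compiler.
A compiler may also hoist or remove a check this simple, so treat it
as a starting point for measurement rather than as evidence.

```lua
-- Sketch only: compare a floating-point loop with and without an
-- explicit integer bounds check.
local N = 1000

local x = {}
for i = 1, N do x[i] = i * 0.5 end

local function sum_unchecked(v, n)
  local s = 0.0
  for i = 1, n do s = s + v[i] * v[i] end
  return s
end

local function sum_checked(v, n, len)
  local s = 0.0
  for i = 1, n do
    -- Integer comparison on the index; it does not depend on the
    -- floating-point multiply-add below.
    if i < 1 or i > len then error("index out of bounds") end
    s = s + v[i] * v[i]
  end
  return s
end

-- Runs f repeatedly and returns elapsed CPU time plus the last result.
local function time(f, ...)
  local t0 = os.clock()
  local result
  for _ = 1, 10000 do result = f(...) end
  return os.clock() - t0, result
end

local t1, s1 = time(sum_unchecked, x, N)
local t2, s2 = time(sum_checked, x, N, N)
print(("unchecked: %.3fs  checked: %.3fs"):format(t1, t2))
```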

--Mike
