[pythran] Re: codespeed

  • From: Pierrick Brunet <pierrick.brunet@xxxxxxxx>
  • To: pythran@xxxxxxxxxxxxx
  • Date: Mon, 21 Apr 2014 20:41:12 +0200

On 20/04/2014 21:56, serge Guelton wrote:
On Sun, Apr 20, 2014 at 10:35:06AM +0200, Pierrick Brunet wrote:
Hi pythraners.

Now that I have re-enabled the bench functionality, I am able to fill a
codespeed database (available only on my computer, sorry...).

Some benchmarks are really interesting.

euler 14: we are only 2x faster than CPython
fibo_seq: we have almost no speedup
harris: only a 2x speedup
loopy_jacob: we have a really huge slowdown (almost 50x compared
to CPython)

There are some other cases too, but I will not point out all of them.

I also record timings for the omp version.

It is really funny to see that nqueen is 2x slower with omp than
without. Funny because it doesn't use it.
It is certainly due to a memory alignment issue caused by symbols
imported from the libgomp library (this is the interpretation we had
at work for this kind of issue).
Some other tests have the same problem, such as perm, wdist and
I've already noticed this a long time ago. The reason is that shared_ref
uses an std::atomic when OpenMP is enabled.
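
To make the cost concrete, here is a minimal sketch of the pattern described above. It is not Pythran's actual shared_ref: the names counted_ref and counter_t are hypothetical, and the only assumption taken from the thread is that the counter type becomes a std::atomic when OpenMP is enabled (detected here through the standard _OPENMP macro). With an atomic counter, every copy and every destruction performs an atomic read-modify-write, even in code that never starts a thread.

```cpp
#include <atomic>
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a reference-counted handle; an illustration
// of the pattern only, not Pythran's real implementation.
#ifdef _OPENMP
using counter_t = std::atomic<std::size_t>;  // atomic update on every copy
#else
using counter_t = std::size_t;               // plain increment/decrement
#endif

template <class T>
class counted_ref {
  struct block {
    T value;
    counter_t count;
    explicit block(T v) : value(std::move(v)), count(1) {}
  };
  block *ptr_;

  void release() {
    // Pre-decrement yields the new value for both counter types.
    if (ptr_ && --ptr_->count == 0) delete ptr_;
  }

public:
  explicit counted_ref(T v) : ptr_(new block(std::move(v))) {}
  counted_ref(const counted_ref &other) : ptr_(other.ptr_) { ++ptr_->count; }
  counted_ref &operator=(const counted_ref &other) {
    if (ptr_ != other.ptr_) {
      release();
      ptr_ = other.ptr_;
      ++ptr_->count;
    }
    return *this;
  }
  ~counted_ref() { release(); }

  T &operator*() const { return ptr_->value; }
  std::size_t use_count() const { return ptr_->count; }
};
```

Building the same sequential benchmark with and without -fopenmp would then show how much of the slowdown comes from the counter alone.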

A very nice solution would be, for each non-scalar type, to have a
version without shared_ref and a version with shared_ref, and to use the
version with shared_ref only when necessary (as computed by a smart
analysis that decides, wherever there would be a copy, whether a ref
copy, a move or a shared ref is necessary).
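
A rough sketch of what the two flavours could look like, using standard smart pointers as stand-ins. Everything here is an assumption made for illustration: unshared_list, shared_list, pass_on and promote are hypothetical names, and the choice between them is made by hand where the proposed analysis would make it automatically.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical illustration: two storage flavours for the same container.
template <class T>
using unshared_list = std::unique_ptr<std::vector<T>>;  // no reference count

template <class T>
using shared_list = std::shared_ptr<std::vector<T>>;    // reference counted

// Case 1: the source is dead after the "copy", so a move is enough
// and no counter is ever touched.
template <class T>
unshared_list<T> pass_on(unshared_list<T> src) {
  return src;
}

// Case 2: several names will stay live on the same data, so the value
// is promoted to the shared flavour once, at the point where it matters.
template <class T>
shared_list<T> promote(unshared_list<T> src) {
  return shared_list<T>(std::move(src));
}
```

The point of the split is that code proven to need only moves or plain references pays nothing for reference counting, atomic or not.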

That could lead to a paper!

It is point 42 in my todo list :p but I had no idea that it was the cause of this issue. I didn't think that atomics could lead to this kind of perf issue when there is no parallelism... But it looks like a better explanation.
