Re: Apparent optimiser bug breaks compiled code (2.1.0-alpha)

  • From: Alexander Gall <alexander.gall@xxxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Tue, 7 Oct 2014 13:41:34 +0200

On Tue, Sep 30, 2014 at 2:34 PM, Alexander Gall
<alexander.gall@xxxxxxxxx> wrote:
>
> The aliasing rule has indeed escaped me so far. This probably explains the 
> other issues I've been having and did not understand.
>

I'm still struggling with this :(  The mistake I made in the
trivialized example from my first posting was painfully obvious, but
the situation in my actual application seems to be substantially
different after all.  Then again, I've been staring at this for so
long now that I may well be blind to the obvious there as well.

I have attached a self-contained example that includes the basic Bloom
filter code from my application.  I can't see a problem related to
aliasing here, but I'm prepared to be corrected on just about anything
I'm saying, since I'm starting to go slightly mad over this thing :)
The test.lua script simply stores the same 6-byte value in the filter
in a loop. In the current state, the calculation of the hash of the
input data starts to fail when the code is optimized, but strangely
enough only if the profiler is enabled with the 'l' option as well:

$ luajit  test.lua
Bad hash counts: h1 0, h2 0
$ luajit -jp=l test.lua
Bit #1 mismatch, expected 212, got 296
Bit #2 mismatch, expected 195, got 253
Bit #3 mismatch, expected 178, got 210
Bit #4 mismatch, expected 161, got 167
Bad hash counts: h1 83, h2 83
[No samples collected]
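
(For anyone who'd rather not unpack the tarball right away: the store
loop in test.lua is roughly of this shape.  The module name, the
method names and the iteration count below are only placeholders for
this sketch; the real script and filter implementation are in the
attachment.)

  local bloom = require("bloom")   -- placeholder name for the attached module

  local f = bloom.new(4)           -- filter using 4 hash functions
  local key = "\1\2\3\4\5\6"       -- the same 6-byte value every iteration

  for _ = 1, 1e6 do                -- iteration count is arbitrary here
     f:store(key)                  -- placeholder for the real store call
  end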

This may or may not be the exact effect that plagues my application,
but I sure would like to understand what I'm doing wrong here. One
thing I tried was turning off GC to make sure I'm not making a
blunder with some object being collected early. That doesn't appear
to be the case.
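
(By "turning off GC" I just mean the stock Lua call at the top of the
script, nothing fancier:)

  collectgarbage("stop")   -- keep the collector from running during the test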

A remark about the Bloom filter code: it contains loops with a small,
fixed number of iterations (4 in this example) that is known when the
filter is created. The code uses an automated "loop unroller" to,
well, unroll these loops. This hack has sped up processing
considerably (it avoids trace aborts caused by the compiler's own loop
unrolling and improves optimization enough to eliminate GC overhead).
I'd be interested to learn whether that's actually a good thing to do
or whether it has drawbacks (like causing the effect I'm seeing here ;)
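
To give an idea of what I mean by that, here is a stripped-down sketch
of such an unroller: generate the unrolled body as a string and
compile it with loadstring().  The unroll/set_bit/hash names are only
illustrative; the real code is in the tarball.

  -- Build a function whose body repeats 'template' n times, with %d
  -- substituted by the (constant) iteration number, then compile it.
  local function unroll(n, template)
     local parts = { "return function(self, data)" }
     for i = 1, n do
        parts[#parts + 1] = template:format(i)
     end
     parts[#parts + 1] = "end"
     return assert(loadstring(table.concat(parts, "\n")))()
  end

  -- E.g. a store() for a filter with 4 hash functions, without any
  -- Lua-level loop left in the hot path:
  local store = unroll(4, "  self:set_bit(self.hash[%d](data))")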

-- 
Alex

Attachment: bloom-bug.tar
Description: Unix tar archive
