Re: Compiler load/store barrier; volatile pointer; barriers in general

  • From: Luke Gorrie <luke@xxxxxxxx>
  • To: luajit@xxxxxxxxxxxxx
  • Date: Thu, 29 Jan 2015 11:01:53 +0100

On 28 January 2015 at 19:12, Mike Pall <mike-1501@xxxxxxxxxx> wrote:

> Luke Gorrie wrote:
> > I suppose that a really safe and future proof way to load/store volatile
> > values would be with a little C library:
> >
> > int peek(int *ptr) { return *ptr; }
> > void poke(int *ptr, int value) { *ptr = value; }
> >
> > and then call that via FFI?
>
> That's overkill. A regular C call is 'safe' in the sense that its
> semantics ('may have any side-effect on memory') will never change.
>

OK, how about this then for a simple solution today for use outside of
inner loops?

void compiler_barrier(void) {}
void cpu_barrier(void) { __sync_synchronize(); } // MFENCE on x86

and we expect that the JIT will not forward loads or stores across either
barrier.
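
To make the intent concrete, here is a minimal C sketch of the kind of
ordering the CPU barrier is meant to give: a producer writes a payload and
then raises a flag, with a full barrier in between. The struct mailbox
layout and the publish/try_consume names are made up for illustration; this
is pre-C11 style code that leans on the GCC/Clang __sync_synchronize()
builtin, not a definitive implementation.

```c
/* Full hardware memory barrier via the GCC/Clang builtin. */
void cpu_barrier(void) { __sync_synchronize(); }

/* Hypothetical shared structure, not something from this thread. */
struct mailbox {
    int payload;
    int ready;
};

/* Producer: store the payload, then the flag, in that order. */
void publish(struct mailbox *m, int value) {
    m->payload = value;
    cpu_barrier();          /* payload store is ordered before the flag store */
    m->ready = 1;
}

/* Consumer: returns 1 and fills *out if a value was published, else 0. */
int try_consume(struct mailbox *m, int *out) {
    if (!m->ready)
        return 0;
    cpu_barrier();          /* flag load is ordered before the payload load */
    *out = m->payload;
    return 1;
}
```

From LuaJIT the same pattern would do the stores and loads through FFI
pointers and call cpu_barrier() through the FFI between them.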


> In the future, there might be some types of C functions that give
> the compiler more freedom for cross-call optimizations. But then
> their declarations would have to be explicitly tagged with
> attributes.
>
> > Can loads be forwarded between loop iterations? That is: could load
> > forwarding create an infinite loop if I am polling for a new value in an
> > FFI pointer (while ptr[0] == nil do end)?
>
> Sure, it will. Just try it.
>
> - Create a new ffi.barrier() builtin for the interpreter.


Thanks for spelling out these steps!
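
For the polling loop quoted above, the same hazard exists in plain C: an
optimizing compiler may load *ptr once and never look again. A rough C
analogue of the workaround is to put a compiler barrier inside the loop.
The sketch below uses the GCC/Clang inline-asm memory clobber;
poll_until_set and max_spins are hypothetical names, and the spin limit is
only there so the example terminates.

```c
/* Compiler-only barrier (GCC/Clang): emits no instruction, but tells the
 * compiler that memory may have changed, so cached loads are discarded. */
static inline void gcc_compiler_barrier(void) {
    __asm__ __volatile__("" ::: "memory");
}

/* Spin until *ptr becomes nonzero or max_spins checks have been made.
 * Returns 1 if a nonzero value was seen, 0 on giving up. */
int poll_until_set(int *ptr, long max_spins) {
    for (long i = 0; i < max_spins; i++) {
        if (*ptr != 0)
            return 1;
        gcc_compiler_barrier();  /* *ptr is reloaded on the next iteration */
    }
    return 0;
}
```

In LuaJIT the equivalent would be a call to an FFI function such as
compiler_barrier() in the loop body, assuming the JIT keeps treating a
regular C call as a barrier as described above.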

> That said, maybe one should introduce a more general builtin (not
> sure how to name this), that allows a wider range of interesting
> optimizations. E.g. telling the compiler that the result of a load
> is definitely constant.
>

I'm not sure if this is related but I have also had fantasies about being
able to translate constant-yielding expressions into actual constants in
the recording step. Like a "memoize this call site" primitive. I am not
sure how widely applicable this would be though, or how prone to misuse.
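
A loose C analogy of what I mean by "memoize this call site", only to show
the semantics and the misuse risk, not how a trace recorder would do it.
MEMOIZE_SITE, slow_config_lookup and cached_config are hypothetical names,
and the macro relies on GNU C extensions (statement expressions and
__typeof__):

```c
/* Each textual use of the macro gets its own static cache, so the
 * expression is evaluated once per call site. Not thread-safe, and the
 * cached value goes stale if the expression was not really constant. */
#define MEMOIZE_SITE(expr)                      \
    ({                                          \
        static int memo_done;                   \
        static __typeof__(expr) memo_value;     \
        if (!memo_done) {                       \
            memo_value = (expr);                \
            memo_done = 1;                      \
        }                                       \
        memo_value;                             \
    })

int lookups;  /* counts how often the slow path actually runs */

int slow_config_lookup(void) {
    lookups++;
    return 42;
}

int cached_config(void) {
    /* The first call evaluates the expression; later calls reuse the value. */
    return MEMOIZE_SITE(slow_config_lookup());
}
```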

> I've previously (*) mentioned the idea of user-definable
> intrinsics for the FFI.


That looks like an awesome feature on the face of it. I wonder in how many
cases it would actually be preferable to calling a C function that uses
GCC's existing intrinsics. Just when the overhead of a CALL and compiler
barrier is excessive?
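
The alternative I have in mind looks like this: a thin C wrapper around an
existing GCC builtin, compiled into a shared library and called through
the FFI. popcount32 is a hypothetical name chosen for the example.

```c
#include <stdint.h>

/* Number of set bits in x, using the GCC/Clang builtin. With suitable
 * target flags the compiler can emit a single POPCNT instruction here. */
int popcount32(uint32_t x) {
    return __builtin_popcount(x);
}
```

Every use from Lua then pays for a CALL and acts as a barrier for the JIT,
which is the overhead a user-definable intrinsic could avoid.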

Cheers,
-Luke
