On 27 July 2014 21:51, Paweł Dziepak <pdziepak@xxxxxxxxxxx> wrote: > Interrupts are not a problem here, we are using LL/SC instructions so the > atomic functions are indeed always atomic (even with interrupts enabled and > on multiprocessor systems). > > Currently, atomic_get_and_set() looks like this: > > miss4: > ldrex r12, [r0] > strex r3, r1, [r0] > teq r3, #0 > bne miss4 > bx lr > > The content of r12 is not moved to r0 at the end of the function so the > return value is wrong and the spinlock implementation wrongly assumes that > the lock is already taken. There is obviously also a problem with memory > ordering but that won't be noticeable on single processor system since this > function call acts as compiler barrier. > That's why I think we should implement atomics > like this: > > int32_t atomic_get_and_set(int32_t* ptr, int32_t value) > { > auto& obj = *reinterpret_cast<std::atomic<int32_t>*>(ptr); > return obj.exchange(value); > } > > The produced assembly with -O2 for ARMv7 is very good: > > dmb sy > .L5: > ldrex r3, [r0] > strex r2, r1, [r0] > cmp r2, #0 > bne .L5 > dmb sy > mov r0, r3 > bx lr I've created a ticket(#11121) which has a patch that sets the return value correctly. This makes the code largely similar to the disassembly of the C++ function pdziepak wrote earlier in the thread except that the DMB instructions aren't present. I suppose this would suffice for now?