[haiku-commits] Re: BRANCH pdziepak-github.lock_elision [523c8e8] src/system/libroot/os/arch/x86 src/system/libroot/os/locks src/system/libroot/os/arch/x86_64 headers/private/system src/system/libroot/os/arch/ppc

  • From: Pawel Dziepak <pdziepak@xxxxxxxxxxx>
  • To: haiku-commits@xxxxxxxxxxxxx
  • Date: Thu, 22 Aug 2013 22:26:49 +0200

2013/8/16 Ingo Weinhold <ingo_weinhold@xxxxxx>:
> On 08/15/2013 08:50 PM, Pawel Dziepak wrote:
>>
>> 2013/8/14 Ingo Weinhold <ingo_weinhold@xxxxxx>:
>>
>>> I don't think doing that unconditionally is a good idea. These are
>>> general
>>> purpose locking functions and a critical section may contain any code. It
>>> may be too long for transactional execution or it may contain an
>>> instruction
>>> that causes the transaction to abort. In such a case we'd waste cycles
>>> unnecessarily to execute the code optimistically first.
>>>
>>> A good solution may be to introduce explicit
>>> mutex_[un]lock_with_elision()
>>> functions, so developers can decide depending on the critical section
>>> whether lock elision should be attempted.
>>
>>
>> I am afraid something like mutex_[un]lock_with_elision() would result
>> in lock elision almost unused.
>
>
> I think it should be used with consideration, since otherwise it is more
> likely to hurt performance than to improve it. To ensure that it is used at
> all, existing code that uses the locking primitives should be reviewed and
> the feature used where it looks promising. Yes, that is quite a bit of work
> (well, probably not that much in userland code, I think).

Even if we choose to extend our locking API with
mutex_[un]lock_with_elision() (and equivalents for other locking
primitives) we cannot do the same for pthread mutexes. In this case,
since we are limited by the current API, I think the best soultion
would be to use some adaptive algorithm to decide whether to try lock
elision.

>> I think the best solution
>> would be to make lock implementation to dynamically decide whether to
>> try lock elision or not (using collected information how many times
>> transaction was aborted, what was the reason on these aborts, etc), so
>> that it will be possible to take most of the CPU abilities. In case of
>> using instructions that always cause aborts inside critical sections
>> there I think mutex_init_etc() flags may be used to force disabling
>> (or enabling) lock elision in the particular lock.
>
>
> Since it's the properties of the critical section that matter and there can
> be multiple critical sections per lock, I still think dedicated
> mutex_[un]lock_with_elision() are the way to go. That can be combined with
> the statistical analysis you suggest. We may want to introduce separate
> structures for locking primitives with lock elision support, so the extra
> memory for the statistical data can be saved where not needed.

In most cases there wouldn't be much point in using both
mutex_[un]lock_with_elision() and "traditional" mutex_[un]lock() on
the same lock. All in-progress transactions have to be aborted once
another thread acquires the mutex. Locks which critical sections that
cause aborts are entered very rarely and there are often entered
critical sections that should use lock elision would indeed benefit
from using lock elision only at some critical sections. However, in
all other scenarios it would be better stick to one behavior in all
critical sections of a given lock.

Intel was nice enough to give us the reason of abort (whether it was
cause by a conflict with another CPU, transaction buffer overflow,
etc) what an algorithm deciding whether to try lock elision would
definitely use. Aborts cause by data conflict would result in much
smaller penalties than aborts caused by, e.g., buffer overflow, but
still when a critical section is frequently aborted due to data
conflict (something not as easy to predict as instructions that cause
aborts) the algorithm may temporarily disable lock elision.

Paweł

Other related posts: