[patch] mutex: optimise generic mutex implementations
David Howells
dhowells at redhat.com
Thu Oct 23 03:24:28 EST 2008
Nick Piggin <npiggin at suse.de> wrote:
> Speed up generic mutex implementations.
>
> - atomic operations which both modify the variable and return something imply
> full smp memory barriers before and after the memory operations involved
> (failing atomic_cmpxchg, atomic_add_unless, etc. don't imply a barrier because
> they don't modify the target). See Documentation/atomic_ops.txt.
> So remove extra barriers and branches.
>
> - All architectures support atomic_cmpxchg. This has no relation to
> __HAVE_ARCH_CMPXCHG. We can just take the atomic_cmpxchg path unconditionally.
>
> This reduces a simple single threaded fastpath lock+unlock test from 590 cycles
> to 203 cycles on a ppc970 system.
>
> Signed-off-by: Nick Piggin <npiggin at suse.de>
This seems to work on FRV, which uses the mutex-dec generic algorithm, though
you have to take that with a pinch of salt, as I don't have SMP hardware for
it.
Acked-by: David Howells <dhowells at redhat.com>
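
For readers unfamiliar with the mutex-dec scheme, here is a rough userspace
sketch of the fastpaths once the explicit barriers are dropped. It is not the
patch itself: it assumes C11 <stdatomic.h> operations with their default
seq_cst ordering as a stand-in for the kernel's atomic_dec_return,
atomic_inc_return and atomic_cmpxchg, and the names sketch_mutex,
sketch_slowpath and sketch_fastpath_* are made up for illustration. The count
convention assumed is 1 = unlocked, 0 = locked, negative = locked with
possible waiters.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mutex count:
 * 1 = unlocked, 0 = locked, negative = locked, possible waiters. */
typedef struct {
	atomic_int count;
} sketch_mutex;

/* Hypothetical slowpath; a real one would sleep or wake waiters.
 * Here it only counts how often it was entered. */
static int slowpath_calls;

static void sketch_slowpath(sketch_mutex *m)
{
	(void)m;
	slowpath_calls++;
}

/* Lock: decrement and test the new value. The read-modify-write both
 * modifies the count and returns a value, so no separate barrier is
 * issued on the success path. */
static inline void sketch_fastpath_lock(sketch_mutex *m)
{
	/* atomic_fetch_sub returns the old value; old - 1 is the new one. */
	if (atomic_fetch_sub(&m->count, 1) - 1 < 0)
		sketch_slowpath(m);
}

/* Unlock: increment and test the new value, again without an
 * explicit barrier in front of the atomic operation. */
static inline void sketch_fastpath_unlock(sketch_mutex *m)
{
	if (atomic_fetch_add(&m->count, 1) + 1 <= 0)
		sketch_slowpath(m);
}

/* Trylock: always use compare-and-exchange (1 -> 0). A failing
 * compare-and-exchange leaves the count untouched. */
static inline bool sketch_fastpath_trylock(sketch_mutex *m)
{
	int expected = 1;

	return atomic_compare_exchange_strong(&m->count, &expected, 0);
}
```

In the uncontended case, lock takes the count from 1 to 0 and unlock takes it
back to 1 without entering the slowpath. C11 seq_cst ordering is only an
approximation of the kernel's barrier semantics, so treat this as an
illustration of the control flow rather than of the exact ordering guarantees.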
More information about the Linuxppc-dev
mailing list