[PATCH] powerpc: Remove static branch prediction in atomic{, 64}_add_unless

Nicholas Piggin npiggin at gmail.com
Tue Oct 4 14:18:35 AEDT 2016


On Mon,  3 Oct 2016 17:03:03 +1100
Anton Blanchard <anton at ozlabs.org> wrote:

> From: Anton Blanchard <anton at samba.org>
> 
> I see quite a lot of static branch mispredictions on a simple
> web serving workload. The issue is in __atomic_add_unless(), called
> from _atomic_dec_and_lock(). There is no obvious common case, so it
> is better to let the hardware predict the branch.

Seems reasonable. How problematic is an unmatched lwarx (one whose
stwcx. is skipped because the branch to 2f is taken) for performance?
It seems that it will serialize a subsequent lwarx until it is next to
complete, and in the case of atomic_dec_and_lock the spin_lock will
immediately do another lwarx. Maybe that's not a big problem.
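
For reference, the call pattern in question can be sketched in portable
C11. This is only an illustration: the sketch_* names are made up here,
the compare-exchange loop stands in for the lwarx/stwcx. sequence, and
an atomic_flag stands in for the spinlock.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical analogue of __atomic_add_unless(): add a to *v unless
 * *v == u. Returns the value observed before the attempt. */
static int sketch_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	/* On failure the compare-exchange refreshes old, so retest it. */
	while (old != u &&
	       !atomic_compare_exchange_weak(v, &old, old + a))
		;
	return old;
}

/* Hypothetical analogue of _atomic_dec_and_lock(): returns true with
 * the lock held only if this call dropped the counter to zero. */
static bool sketch_dec_and_lock(atomic_int *v, atomic_flag *lock)
{
	/* Fast path: decrement unless the counter is 1. */
	if (sketch_add_unless(v, -1, 1) != 1)
		return false;

	/* The counter was 1, so the update was skipped (the "unmatched"
	 * case) and the lock acquisition follows immediately. */
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;

	if (atomic_fetch_sub(v, 1) == 1)
		return true;	/* reached zero, lock stays held */

	atomic_flag_clear_explicit(lock, memory_order_release);
	return false;
}
```

Either outcome of the fast path is plausible depending on the workload,
which is why neither direction of the branch is an obvious common case.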

Putting a regular load and test before the lwarx would avoid the
unmatched lwarx, but it would be disappointing because many users have
success as the common case and would pay for the extra load.
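
A rough C11 sketch of that alternative, under the same caveats: the
function name is hypothetical, and the compare-exchange (which does its
own load of the location) merely stands in for the lwarx/stwcx. loop.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical variant: test with a regular load before starting the
 * atomic sequence. Returns true if the add was performed. */
static bool sketch_precheck_add_unless(atomic_int *v, int a, int u)
{
	/* Regular load and test: pure overhead whenever the add succeeds. */
	int old = atomic_load_explicit(v, memory_order_relaxed);
	if (old == u)
		return false;	/* early exit, atomic sequence never started */

	/* Stand-in for the reservation loop, which still has to retest
	 * because the value may have changed since the first load. */
	do {
		if (old == u)
			return false;
	} while (!atomic_compare_exchange_weak(v, &old, old + a));

	return true;
}
```

The early exit only helps callers that frequently hit the "unless"
value; everyone else executes the additional load and compare first.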


> Signed-off-by: Anton Blanchard <anton at samba.org>
> ---
>  arch/powerpc/include/asm/atomic.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
> index f08d567..2b90335 100644
> --- a/arch/powerpc/include/asm/atomic.h
> +++ b/arch/powerpc/include/asm/atomic.h
> @@ -233,7 +233,7 @@ static __inline__ int __atomic_add_unless(atomic_t *v, int a, int u)
>  	PPC_ATOMIC_ENTRY_BARRIER
>  "1:	lwarx	%0,0,%1		# __atomic_add_unless\n\
>  	cmpw	0,%0,%3 \n\
> -	beq-	2f \n\
> +	beq	2f \n\
>  	add	%0,%2,%0 \n"
>  	PPC405_ERR77(0,%2)
>  "	stwcx.	%0,0,%1 \n\
> @@ -539,7 +539,7 @@ static __inline__ int atomic64_add_unless(atomic64_t *v, long a, long u)
>  	PPC_ATOMIC_ENTRY_BARRIER
>  "1:	ldarx	%0,0,%1		# __atomic_add_unless\n\
>  	cmpd	0,%0,%3 \n\
> -	beq-	2f \n\
> +	beq	2f \n\
>  	add	%0,%2,%0 \n"
>  "	stdcx.	%0,0,%1 \n\
>  	bne-	1b \n"
