[patch 1/2] powerpc: optimise smp_mb
Nick Piggin
npiggin at suse.de
Fri Feb 20 04:12:29 EST 2009
In a microbenchmark on my G5, using an lwsync; isync sequence for smp_mb is
5 times faster than using sync, even though it takes more instructions.
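
For anyone wanting to try a comparison like this in user space, a rough sketch
follows. It is not the microbenchmark used for the number above, which is not
shown in this mail. The names barrier_sync, barrier_lwsync_isync and
ns_per_call are made up for illustration, the sketch writes the register as a
plain 0 rather than %r0, and on anything other than ppc64 both barriers fall
back to the compiler's generic full barrier, so the comparison is only
meaningful on ppc64.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

#if defined(__powerpc64__)
/* Heavyweight barrier: plain sync. */
static void barrier_sync(void)
{
	__asm__ __volatile__ ("sync" : : : "memory");
}

/* Candidate sequence: lwsync, never-taken branch, isync.
 * cmpw writes cr0, so this sketch lists "cc" as clobbered. */
static void barrier_lwsync_isync(void)
{
	__asm__ __volatile__ (
		"1:	lwsync		\n"
		"	cmpw	0,0,0	\n"
		"	bne-	1b	\n"
		"	isync		\n"
		: : : "cc", "memory");
}
#else
/* Assumption: fallback so the file builds elsewhere; both are identical. */
static void barrier_sync(void)         { __sync_synchronize(); }
static void barrier_lwsync_isync(void) { __sync_synchronize(); }
#endif

/* Average cost of one barrier in nanoseconds. The indirect call overhead
 * is included, but it is the same for both variants. */
static double ns_per_call(void (*barrier)(void), long iters)
{
	struct timespec t0, t1;
	long i;

	assert(iters > 0);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		barrier();
	clock_gettime(CLOCK_MONOTONIC, &t1);

	return ((t1.tv_sec - t0.tv_sec) * 1e9 +
		(t1.tv_nsec - t0.tv_nsec)) / iters;
}
```

Calling ns_per_call(barrier_sync, n) and ns_per_call(barrier_lwsync_isync, n)
with the same n and comparing the two results gives the ratio. A loop of
back-to-back barriers with no surrounding loads or stores is a best case, so
it says little about real workloads.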
Running tbench with 4 clients on my 4 core G5 (20 times) gives the
following:
unpatched AVG=920.33 STD=2.36
patched AVG=921.27 STD=2.77
So there is no big improvement here; the difference could well be in the noise.
But other workloads or systems might see a bigger win, and the patch
may be interesting or could be improved, so I'm asking for comments.
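
As a rough check on the "in the noise" reading of the tbench numbers, here is
a small sketch of a Welch-style comparison. It assumes that STD is the sample
standard deviation over the 20 runs of each kernel, which the numbers above do
not state explicitly. The function name welch_t_squared is made up for
illustration; it works with t squared so that no sqrt() call is needed.

```c
#include <assert.h>

/* Squared Welch t statistic from two means, sample standard deviations
 * and run counts: (mean_b - mean_a)^2 / (sd_a^2/n_a + sd_b^2/n_b). */
static double welch_t_squared(double mean_a, double sd_a, int n_a,
			      double mean_b, double sd_b, int n_b)
{
	double diff, se2;

	assert(n_a > 1 && n_b > 1);
	diff = mean_b - mean_a;
	/* Squared standard error of the difference of the two means. */
	se2 = sd_a * sd_a / n_a + sd_b * sd_b / n_b;
	return diff * diff / se2;
}
```

With the figures above, welch_t_squared(920.33, 2.36, 20, 921.27, 2.77, 20)
comes to about 1.33, so t is about 1.16. A two-sided 5% test with this many
runs needs t of roughly 2 (t squared of roughly 4), so under that assumption
the measured difference is not distinguishable from noise.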
---
Index: linux-2.6/arch/powerpc/include/asm/system.h
===================================================================
--- linux-2.6.orig/arch/powerpc/include/asm/system.h 2009-02-20 01:51:24.000000000 +1100
+++ linux-2.6/arch/powerpc/include/asm/system.h 2009-02-20 02:09:41.000000000 +1100
@@ -52,7 +52,16 @@
# define SMPWMB eieio
#endif
+#ifdef __powerpc64__
+#define smp_mb() __asm__ __volatile__ ( \
+ "1: lwsync \n" \
+ " cmpw 0,%%r0,%%r0 \n" \
+ " bne- 1b \n" \
+ " isync \n" \
+ : : : "memory")
+#else
#define smp_mb() mb()
+#endif
#define smp_rmb() __asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
#define smp_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : : :"memory")
#define smp_read_barrier_depends() read_barrier_depends()