[RFC 3/5] powerpc: atomic: implement atomic{,64}_{add,sub}_return_* variants

Peter Zijlstra peterz at infradead.org
Sat Aug 29 01:39:21 AEST 2015


On Fri, Aug 28, 2015 at 10:16:02PM +0800, Boqun Feng wrote:
> On Fri, Aug 28, 2015 at 08:06:14PM +0800, Boqun Feng wrote:
> > Hi Peter,
> > 
> > On Fri, Aug 28, 2015 at 12:48:54PM +0200, Peter Zijlstra wrote:
> > > On Fri, Aug 28, 2015 at 10:48:17AM +0800, Boqun Feng wrote:
> > > > +/*
> > > > + * Since {add,sub}_return_relaxed and xchg_relaxed are implemented with
> > > > + * a "bne-" instruction at the end, so an isync is enough as a acquire barrier
> > > > + * on the platform without lwsync.
> > > > + */
> > > > +#ifdef CONFIG_SMP
> > > > +#define smp_acquire_barrier__after_atomic() \
> > > > +	__asm__ __volatile__(PPC_ACQUIRE_BARRIER : : : "memory")
> > > > +#else
> > > > +#define smp_acquire_barrier__after_atomic() barrier()
> > > > +#endif
> > > > +#define arch_atomic_op_acquire(op, args...)				\
> > > > +({									\
> > > > +	typeof(op##_relaxed(args)) __ret  = op##_relaxed(args);		\
> > > > +	smp_acquire_barrier__after_atomic();				\
> > > > +	__ret;								\
> > > > +})
> > > > +
> > > > +#define arch_atomic_op_release(op, args...)				\
> > > > +({									\
> > > > +	smp_lwsync();							\
> > > > +	op##_relaxed(args);						\
> > > > +})
> > > 
> > > Urgh, so this is RCpc. We were trying to get rid of that if possible.
> > > Lets wait until that's settled before introducing more of it.
> > > 
> > > lkml.kernel.org/r/20150820155604.GB24100 at arm.com
> > 
> > OK, get it. Thanks.
> > 
> > So I'm not going to introduce these arch specific macros, I think what I
> > need to implement are just _relaxed variants and cmpxchg_acquire.
> 
> Ah.. just read through the thread you mentioned, I might misunderstand
> you, probably because I didn't understand RCpc well..
> 
> You are saying that in a RELEASE we -might- switch from smp_lwsync() to
> smp_mb() semantically, right? I guess this means we -might- switch from
> RCpc to RCsc, right?
> 
> If so, I think I'd better to wait until we have a conclusion for this.

Yes, the difference between RCpc and RCsc is in the meaning of RELEASE +
ACQUIRE. With RCsc, that pair implies a full memory barrier; with RCpc, it
does not.

Currently PowerPC is the only arch that (can, and) does RCpc and gives a
weaker RELEASE + ACQUIRE. Only the CPU that did the ACQUIRE is guaranteed
to see the stores of the CPU that did the RELEASE in order.
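
For illustration only, here is a rough userspace analogue of the
fence-based construction in the quoted patch, written with C11
<stdatomic.h> rather than kernel primitives. The helper names
(add_return_relaxed and friends) are hypothetical and are not the
kernel's API, and the C11 fences only approximate the barriers in the
patch (there is no C11 equivalent of the isync-after-"bne-" trick). The
seq_cst variant is included purely as a contrast to the weaker
release/acquire pair.

```c
#include <stdatomic.h>

/* Hypothetical helper: add i to *v and return the new value, unordered. */
static inline int add_return_relaxed(atomic_int *v, int i)
{
	return atomic_fetch_add_explicit(v, i, memory_order_relaxed) + i;
}

/* ACQUIRE flavour: do the relaxed op first, then an acquire fence. */
static inline int add_return_acquire(atomic_int *v, int i)
{
	int ret = add_return_relaxed(v, i);

	atomic_thread_fence(memory_order_acquire);
	return ret;
}

/* RELEASE flavour: release fence first, then the relaxed op. */
static inline int add_return_release(atomic_int *v, int i)
{
	atomic_thread_fence(memory_order_release);
	return add_return_relaxed(v, i);
}

/* Stronger flavour for contrast: a seq_cst fence on both sides. */
static inline int add_return_full(atomic_int *v, int i)
{
	int ret;

	atomic_thread_fence(memory_order_seq_cst);
	ret = add_return_relaxed(v, i);
	atomic_thread_fence(memory_order_seq_cst);
	return ret;
}
```

In this sketch, add_return_release() on one variable followed by
add_return_acquire() on another gives no full barrier between them;
only add_return_full() asks for one.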

As it stands, RCU is the only _known_ codebase where this matters, but
we did in fact write code for a fair number of years 'assuming' RELEASE
+ ACQUIRE was a full barrier, so who knows what else is out there.


RCsc - release consistency sequential consistency
RCpc - release consistency processor consistency

https://en.wikipedia.org/wiki/Processor_consistency (where they have
s/sequential/causal/)

