[RFC 3/5] powerpc: atomic: implement atomic{,64}_{add,sub}_return_* variants

Paul E. McKenney paulmck at linux.vnet.ibm.com
Tue Sep 15 02:26:40 AEST 2015


On Mon, Sep 14, 2015 at 04:38:48PM +0100, Will Deacon wrote:
> On Mon, Sep 14, 2015 at 01:11:56PM +0100, Peter Zijlstra wrote:
> > On Mon, Sep 14, 2015 at 02:01:53PM +0200, Peter Zijlstra wrote:
> > > The scenario is:
> > > 
> > > 	CPU0			CPU1
> > > 
> > > 				unlock(x)
> > > 				  smp_store_release(&x->lock, 0);
> > > 
> > > 	unlock(y)
> > > 	  smp_store_release(&next->lock, 1); /* next == &y */
> > > 
> > > 				lock(y)
> > > 				  while (!smp_load_acquire(&y->lock))
> > > 					cpu_relax();
> > > 
> > > 
> > > Where the lock does _NOT_ issue a store to acquire the lock at all. Now
> > > I don't think any of our current primitives manage this, so we should be
> > > good, but it might just be possible.
> > 
> > So with a bit more thought this seems fundamentally impossible, you
> > always need some stores in a lock() implementation, the above for
> > instance needs to queue itself, otherwise CPU0 will not be able to find
> > it etc..
> 
> Which brings us back round to separating LOCK/UNLOCK from ACQUIRE/RELEASE.

I believe that we do need to do this, unless we decide to have unlock-lock
continue to imply only acquire and release, rather than full ordering.
I believe that Mike Ellerman is working up additional benchmarking
on this.

							Thanx, Paul

> If we say that UNLOCK(foo) -> LOCK(bar) is ordered but RELEASE(baz) ->
> ACQUIRE(boz) is only ordered by smp_mb__release_acquire(), then I think
> we're in a position where we can at least build arbitrary locks portably
> out of ACQUIRE/RELEASE operations, even though I don't see any users of
> that macro in the imminent future.
> 
> I'll have a crack at some documentation.
> 
> Will
> 
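
For readers following the quoted scenario, here is a minimal userspace
sketch of the hand-off, assuming C11 atomics and pthreads as stand-ins for
smp_store_release()/smp_load_acquire(). The names struct waiter,
shared_data, cpu0() and run_handoff() are hypothetical and are not kernel
code. It shows only the RELEASE -> ACQUIRE pairing on the lock word; it
says nothing about whether an UNLOCK followed by a LOCK gives full ordering.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical per-waiter node; not a kernel structure. */
struct waiter {
	atomic_int lock;	/* 0 = keep waiting, 1 = lock handed over */
};

static int shared_data;		/* plain data published before the hand-off */
static struct waiter y;		/* the waiting side's node */

/* "CPU0": write the data, then hand the lock over with a release store. */
static void *cpu0(void *arg)
{
	struct waiter *next = arg;

	shared_data = 42;
	atomic_store_explicit(&next->lock, 1, memory_order_release);
	return NULL;
}

/*
 * "CPU1" runs in the calling thread: it spins using acquire loads only and
 * never stores to the lock word to take it, as in the quoted scenario.
 * Returns the value of shared_data seen after the hand-off, or -1 if the
 * helper thread could not be created.
 */
int run_handoff(void)
{
	pthread_t t;
	int seen;

	shared_data = 0;
	atomic_store_explicit(&y.lock, 0, memory_order_relaxed);

	if (pthread_create(&t, NULL, cpu0, &y) != 0)
		return -1;

	while (!atomic_load_explicit(&y.lock, memory_order_acquire))
		;	/* the kernel version would call cpu_relax() here */

	seen = shared_data;	/* ordered after the acquire load */
	pthread_join(t, NULL);
	return seen;
}
```

Because the acquire load that observes 1 synchronizes with the release
store, the read of shared_data in run_handoff() is guaranteed to see the
value written before the hand-off. As noted in the thread, a real lock()
would also need a store to queue the waiter so that the unlocker can find
it; the sketch sidesteps that by passing the node pointer directly.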


