[PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

Peter Zijlstra peterz at infradead.org
Fri Oct 9 19:31:38 AEDT 2015


On Thu, Oct 08, 2015 at 02:44:39PM -0700, Paul E. McKenney wrote:
> On Thu, Oct 08, 2015 at 01:16:38PM +0200, Peter Zijlstra wrote:
> > On Thu, Oct 08, 2015 at 02:50:36PM +1100, Michael Ellerman wrote:
> > > On Wed, 2015-10-07 at 08:25 -0700, Paul E. McKenney wrote:
> > 
> > > > Currently, we do need smp_mb__after_unlock_lock() to be after the
> > > > acquisition on PPC -- putting it between the unlock and the lock
> > > > of course doesn't cut it for the cross-thread unlock/lock case.
> > 
> > This ^, that makes me think I don't understand
> > smp_mb__after_unlock_lock.
> > 
> > How is:
> > 
> > 	UNLOCK x
> > 	smp_mb__after_unlock_lock()
> > 	LOCK y
> > 
> > a problem? That's still a full barrier.
> 
> The problem is that I need smp_mb__after_unlock_lock() to give me
> transitivity even if the UNLOCK happened on one CPU and the LOCK
> on another.  For that to work, the smp_mb__after_unlock_lock() needs
> to be either immediately after the acquire (the current choice) or
> immediately before the release (which would also work from a purely
> technical viewpoint, but I much prefer the current choice).
> 
> Or am I missing your point?

So lots of little confusions added up to complete fail :-{

Mostly I think it was two half-remembered rules: that UNLOCK x + LOCK x
are fully ordered (where I forgot: but not against uninvolved CPUs) and
that RELEASE/ACQUIRE are transitive (where I forgot: RELEASE/ACQUIRE
_chains_ are transitive, but again not against uninvolved CPUs).

Which leads me to think I would like to suggest alternative rules for
RELEASE/ACQUIRE (to replace those Will suggested; as I think those are
partly responsible for my confusion).

 - RELEASE -> ACQUIRE is fully ordered (but not a full barrier) when
   they operate on the same variable and the ACQUIRE reads from the
   RELEASE. Notably, RELEASE/ACQUIRE are RCpc and lack transitivity.

 - RELEASE -> ACQUIRE can be upgraded to a full barrier (including
   transitivity) using smp_mb__release_acquire(), either before RELEASE
   or after ACQUIRE (but consistently [*]).

 - RELEASE -> ACQUIRE _chains_ (on shared variables) preserve causality
   (because each link is fully ordered), but are not transitive.
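
To make the first rule concrete, here is a rough userspace sketch of a
single RELEASE -> ACQUIRE link. It assumes C11 atomics and pthreads as
stand-ins for smp_store_release()/smp_load_acquire(); the names payload,
flag and run_message_passing() are made up for the example. The consumer
spins until its ACQUIRE reads the value written by the RELEASE, so the
two operate on the same variable and the payload store is ordered before
the payload load.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;       /* plain data, published through 'flag' */
static atomic_int flag;   /* the variable both RELEASE and ACQUIRE use */

static void *producer(void *arg)
{
	(void)arg;
	payload = 42;
	/* RELEASE: the store to payload is ordered before this store. */
	atomic_store_explicit(&flag, 1, memory_order_release);
	return NULL;
}

static void *consumer(void *arg)
{
	int *seen = arg;

	/* ACQUIRE: spin until we read the value the RELEASE wrote. */
	while (!atomic_load_explicit(&flag, memory_order_acquire))
		;
	/* The ACQUIRE read from the RELEASE, so payload must be visible. */
	assert(payload == 42);
	*seen = payload;
	return NULL;
}

/* Hypothetical helper: run one round and return what the consumer saw. */
int run_message_passing(void)
{
	pthread_t p, c;
	int seen = 0;

	payload = 0;
	atomic_store(&flag, 0);
	pthread_create(&c, NULL, consumer, &seen);
	pthread_create(&p, NULL, producer, NULL);
	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return seen;
}
```

Note that this only shows ordering between the two threads involved; it
says nothing about what a third thread would observe.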

And I think that in the past few weeks we've been using 'transitive'
ambiguously: the definition we have in Documentation/memory-barriers.txt
is a _strong_ transitivity, where we can make guarantees about CPUs not
directly involved.

What we have here (due to RCpc) is a weak form of transitivity, which,
while it preserves the natural concept of causality, does not extend to
other CPUs.
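
A rough sketch of that weak form, again assuming C11 atomics and
pthreads as userspace stand-ins for the kernel primitives (all names
here are hypothetical): three threads form a RELEASE -> ACQUIRE chain
over two shared variables, and the last thread in the chain is
guaranteed to see the first thread's plain store.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int chain_data;              /* written by the first link only */
static atomic_int link_a, link_b;   /* the two chain variables */

static void *chain_cpu0(void *arg)
{
	(void)arg;
	chain_data = 1;
	atomic_store_explicit(&link_a, 1, memory_order_release);
	return NULL;
}

static void *chain_cpu1(void *arg)
{
	(void)arg;
	/* First link: ACQUIRE reads from cpu0's RELEASE on link_a. */
	while (!atomic_load_explicit(&link_a, memory_order_acquire))
		;
	atomic_store_explicit(&link_b, 1, memory_order_release);
	return NULL;
}

static void *chain_cpu2(void *arg)
{
	int *seen = arg;

	/* Second link: ACQUIRE reads from cpu1's RELEASE on link_b. */
	while (!atomic_load_explicit(&link_b, memory_order_acquire))
		;
	/* Causality along the chain: cpu0's store must be visible here. */
	assert(chain_data == 1);
	*seen = chain_data;
	return NULL;
}

/* Hypothetical helper: run the chain once, return what cpu2 observed. */
int run_release_acquire_chain(void)
{
	pthread_t t[3];
	int seen = 0;
	int i;

	chain_data = 0;
	atomic_store(&link_a, 0);
	atomic_store(&link_b, 0);
	pthread_create(&t[2], NULL, chain_cpu2, &seen);
	pthread_create(&t[1], NULL, chain_cpu1, NULL);
	pthread_create(&t[0], NULL, chain_cpu0, NULL);
	for (i = 0; i < 3; i++)
		pthread_join(t[i], NULL);
	return seen;
}
```

A fourth thread that never acquires link_a or link_b is not part of the
chain, and nothing in this sketch constrains the order in which it
observes these stores.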

So we could go around and call them 'strong' and 'weak' transitivity,
but I suspect it's easier for everyone involved if we come up with
separate terms (less room for error if we accidentally omit the
'strong/weak' qualifier).


[*] Do we want to take that choice away and go for:
smp_mb__after_release_acquire() ?

