[PATCH v2] barriers: introduce smp_mb__release_acquire and update documentation

Boqun Feng boqun.feng at gmail.com
Mon Oct 19 12:17:18 AEDT 2015


On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote:
> On Fri, Oct 09, 2015 at 10:31:38AM +0200, Peter Zijlstra wrote:
[snip]
> > 
> > So lots of little confusions added up to complete fail :-{
> > 
> > Mostly I think it was the UNLOCK x + LOCK x are fully ordered (where I
> > forgot: but not against uninvolved CPUs) and RELEASE/ACQUIRE are
> > transitive (where I forgot: RELEASE/ACQUIRE _chains_ are transitive, but
> > again not against uninvolved CPUs).
> > 
> > Which leads me to think I would like to suggest alternative rules for
> > RELEASE/ACQUIRE (to replace those Will suggested; as I think those are
> > partly responsible for my confusion).
> 
> Yeah, sorry. I originally used the phrase "fully ordered" but changed it
> to "full barrier", which has stronger transitivity (newly understood
> definition) requirements that I didn't intend.
> 
> RELEASE -> ACQUIRE should be used for message passing between two CPUs
> and not have ordering effects on other observers unless they're part of
> the RELEASE -> ACQUIRE chain.
> 
> >  - RELEASE -> ACQUIRE is fully ordered (but not a full barrier) when
> >    they operate on the same variable and the ACQUIRE reads from the
> >    RELEASE. Notably, RELEASE/ACQUIRE are RCpc and lack transitivity.
> 
> Are we explicit about the difference between "fully ordered" and "full
> barrier" somewhere else, because this looks like it will confuse people.
> 

This is confusing me right now. ;-)

Let's use a simple example with only one primitive. As I understand it,
if we say a primitive A is "fully ordered", we actually mean:

1.	The memory operations preceding (in program order) A can't be
	reordered after the memory operations following (in PO) A.

and

2.	The memory operation(s) in A can't be reordered before the
	memory operations preceding (in PO) A or after the memory
	operations following (in PO) A.
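
To make that concrete, here is a rough userspace sketch of the two-CPU
message-passing case. It is an analogy only: it uses C11 atomics and
pthreads rather than the kernel's smp_store_release()/smp_load_acquire(),
and payload, flag and run_message_passing() are names made up for
illustration. The release store plays the role of A: the payload store
preceding it can't be reordered after it, so a consumer whose acquire
load reads from the release must also see the payload.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;		/* plain data, published via 'flag' */
static atomic_int flag;		/* 0 = not ready, 1 = ready */

static void *producer(void *arg)
{
	(void)arg;
	payload = 42;		/* precedes (in PO) the RELEASE */
	atomic_store_explicit(&flag, 1, memory_order_release);
	return NULL;
}

static void *consumer(void *arg)
{
	int *out = arg;

	/* Spin until the ACQUIRE reads from the RELEASE. */
	while (!atomic_load_explicit(&flag, memory_order_acquire))
		;
	*out = payload;		/* follows (in PO) the ACQUIRE */
	return NULL;
}

/* Hypothetical helper: runs one producer/consumer pair.
 * Thread-creation error handling is omitted for brevity. */
int run_message_passing(void)
{
	pthread_t p, c;
	int seen = -1;

	payload = 0;
	atomic_store(&flag, 0);
	pthread_create(&c, NULL, consumer, &seen);
	pthread_create(&p, NULL, producer, NULL);
	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return seen;
}
```

run_message_passing() should always return 42. Note that this only
involves the two CPUs in the RELEASE -> ACQUIRE pair; it says nothing
about what a third, uninvolved CPU observes.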

If we say A is a "full barrier", we actually mean:

1.	The memory operations preceding (in program order) A can't be
	reordered after the memory operations following (in PO) A.

and

2.	The memory ordering guarantee in #1 is visible globally.

Is that correct? Or is "full barrier" stronger than I understand,
i.e. is there a third property of "full barrier":

3.	The memory operation(s) in A can't be reordered before the
	memory operations preceding (in PO) A or after the memory
	operations following (in PO) A.

IOW, is "full barrier" a stronger version of "fully ordered" or not?
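
One way to see what "visible globally" (property #2 of the "full
barrier" definition above) would add is the IRIW shape (independent
reads of independent writes), where two uninvolved readers must agree on
the order of two unrelated writes. The following is a hedged userspace
sketch, again using C11 atomics and pthreads as a stand-in for kernel
primitives. memory_order_seq_cst is used here only as an approximation
of "full barrier" semantics, and all function names are hypothetical.
With seq_cst accesses, C11 forbids the outcome r1 == 1, r2 == 0,
r3 == 1, r4 == 0; with only release stores and acquire loads, C11 would
permit it.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, y;
static int r1, r2, r3, r4;	/* read back only after pthread_join() */

static void *writer_x(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_seq_cst);
	return NULL;
}

static void *writer_y(void *arg)
{
	(void)arg;
	atomic_store_explicit(&y, 1, memory_order_seq_cst);
	return NULL;
}

static void *reader_xy(void *arg)
{
	(void)arg;
	r1 = atomic_load_explicit(&x, memory_order_seq_cst);
	r2 = atomic_load_explicit(&y, memory_order_seq_cst);
	return NULL;
}

static void *reader_yx(void *arg)
{
	(void)arg;
	r3 = atomic_load_explicit(&y, memory_order_seq_cst);
	r4 = atomic_load_explicit(&x, memory_order_seq_cst);
	return NULL;
}

/* Counts runs in which the two readers disagree on the write order.
 * Thread-creation error handling is omitted for brevity. */
int iriw_forbidden_count(int iterations)
{
	int forbidden = 0;

	for (int i = 0; i < iterations; i++) {
		pthread_t t[4];

		atomic_store(&x, 0);
		atomic_store(&y, 0);
		pthread_create(&t[0], NULL, reader_xy, NULL);
		pthread_create(&t[1], NULL, reader_yx, NULL);
		pthread_create(&t[2], NULL, writer_x, NULL);
		pthread_create(&t[3], NULL, writer_y, NULL);
		for (int j = 0; j < 4; j++)
			pthread_join(t[j], NULL);

		if (r1 == 1 && r2 == 0 && r3 == 1 && r4 == 0)
			forbidden++;
	}
	return forbidden;
}
```

With seq_cst the count should stay at zero for any number of
iterations. A zero count from a weaker variant would prove nothing,
since a permitted outcome may simply not show up on a given machine.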

Regards,
Boqun

> >  - RELEASE -> ACQUIRE can be upgraded to a full barrier (including
> >    transitivity) using smp_mb__release_acquire(), either before RELEASE
> >    or after ACQUIRE (but consistently [*]).
> 
> Hmm, but we don't actually need this for RELEASE -> ACQUIRE, afaict. This
> is just needed for UNLOCK -> LOCK, and is exactly what RCU is currently
> using (for PPC only).
> 
> Stepping back a second, I believe that there are three cases:
> 
> 
>  RELEASE X -> ACQUIRE Y (same CPU)
>    * Needs a barrier on TSO architectures for full ordering
> 
>  UNLOCK X -> LOCK Y (same CPU)
>    * Needs a barrier on PPC for full ordering
> 
>  RELEASE X -> ACQUIRE X (different CPUs)
>  UNLOCK X -> ACQUIRE X (different CPUs)
>    * Fully ordered everywhere...
>    * ... but needs a barrier on PPC to become a full barrier
> 
> 

