perf events ring buffer memory barrier on powerpc

Paul E. McKenney paulmck at linux.vnet.ibm.com
Mon Nov 4 21:00:43 EST 2013


On Mon, Nov 04, 2013 at 10:07:44AM +0100, Peter Zijlstra wrote:
> On Sat, Nov 02, 2013 at 08:20:48AM -0700, Paul E. McKenney wrote:
> > On Fri, Nov 01, 2013 at 11:30:17AM +0100, Peter Zijlstra wrote:
> > > Furthermore there's a gazillion parallel userspace programs.
> > 
> > Most of which have very unaggressive concurrency designs.
> 
> pthread_mutex_t A, B;
> 
> char data_A[x];
> int  counter_B = 1;
> 
> void funA(void)
> {
> 	pthread_mutex_lock(&A);
> 	memset(data_A, 0, sizeof(data_A));
> 	pthread_mutex_unlock(&A);
> }
> 
> void funB(void)
> {
> 	pthread_mutex_lock(&B);
> 	counter_B++;
> 	pthread_mutex_unlock(&B);
> }
> 
> void funC(void)
> {
> 	pthread_mutex_lock(&B);
> 	printf("%d\n", counter_B);
> 	pthread_mutex_unlock(&B);
> }
> 
> Then run: funA, funB, funC concurrently, and end with a funC.
> 
> Then explain to userman that his unaggressive program can return:
> 0
> 1
> 
> Because the memset() thought it might be a cute idea to overwrite
> counter_B and fix it up 'later'. Which if I understood you right is
> valid in C/C++ :-(
> 
> Not that any actual memset implementation exhibiting this trait wouldn't
> be shot on the spot.

Even without such a malicious memset() implementation, I must still
explain about false sharing when the developer notices that the
unaggressive program isn't running as fast as expected.
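
To make the false-sharing point concrete, here is a minimal sketch of
one common mitigation: giving each independently locked object its own
cache line.  The 64-byte line size, the DATA_A_SIZE stand-in for "x",
and the names padded_state and cache_line_of() are assumptions made up
for illustration, not anything from the program quoted above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64	/* assumption: common, but not universal */
#define DATA_A_SIZE 100		/* hypothetical stand-in for "x" */

/*
 * Each member starts on its own cache line, so a thread writing data_A
 * under mutex A does not bounce the line holding counter_B.
 */
struct padded_state {
	alignas(CACHE_LINE_SIZE) char data_A[DATA_A_SIZE]; /* guarded by A */
	alignas(CACHE_LINE_SIZE) int counter_B;		    /* guarded by B */
};

/* Index of the cache line that holds byte offset 'off' of the struct. */
static inline size_t cache_line_of(size_t off)
{
	return off / CACHE_LINE_SIZE;
}

static_assert(offsetof(struct padded_state, counter_B) % CACHE_LINE_SIZE == 0,
	      "counter_B must start a cache line");
```

The cost is memory: the structure grows to a multiple of the assumed
line size, so this is worth doing only for data that really is hot on
more than one CPU.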

> > > > By marking "ptr" as atomic, thus telling the compiler not to mess with it.
> > > > And thus requiring that all accesses to it be decorated, which in the
> > > > case of RCU could be buried in the RCU accessors.
> > > 
> > > This seems contradictory; marking it atomic would look like:
> > > 
> > > struct foo {
> > > 	unsigned long value;
> > > 	__atomic void *ptr;
> > > 	unsigned long value1;
> > > };
> > > 
> > > Clearly we cannot hide this definition in accessors, because then
> > > accesses to value* won't see the annotation.
> > 
> > #define __rcu __atomic
> 
> Yeah, except we don't use __rcu all that consistently; in fact I don't
> know if I ever added it.

There are more than 300 of them in the kernel.  Plus sparse can be
convinced to yell at you if you don't use them.  So lack of __rcu could
be fixed without too much trouble.
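
For readers following along, the "mark ptr atomic and bury the
decoration in accessors" idea might look roughly like the C11 sketch
below.  The _Atomic(void *) spelling stands in for the "__atomic"
qualifier in the quoted struct, and foo_publish() and foo_dereference()
are hypothetical helpers loosely modeled on rcu_assign_pointer() and
rcu_dereference(); this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
	unsigned long value;
	_Atomic(void *) ptr;	/* stands in for "__atomic void *ptr" */
	unsigned long value1;
};

/* Hypothetical publish-side accessor: orders prior stores before ptr. */
static inline void foo_publish(struct foo *f, void *p)
{
	atomic_store_explicit(&f->ptr, p, memory_order_release);
}

/* Hypothetical read-side accessor: a dependency-ordered load of ptr. */
static inline void *foo_dereference(struct foo *f)
{
	return atomic_load_explicit(&f->ptr, memory_order_consume);
}
```

The annotation sits on the member declaration, where the compiler sees
it for every access to the structure, while the memory-order arguments
stay hidden inside the accessors.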

The C/C++11 requirement to annotate functions that take arguments or
return values obtained from rcu_dereference() is another story.  But for
missing annotations to cause trouble, either the compilers have to get
significantly more aggressive or developers have to be doing unusual
things that result in rcu_dereference() returning something whose value
the compiler can predict exactly.

							Thanx, Paul


