[PATCH 12/17] powerpc/qspinlock: add ability to prod new queue head CPU

Nicholas Piggin npiggin at gmail.com
Thu Nov 10 22:32:51 AEDT 2022


On Thu Nov 10, 2022 at 10:42 AM AEST, Jordan Niethe wrote:
> On Thu, 2022-07-28 at 16:31 +1000, Nicholas Piggin wrote:
> [resend as utf-8, not utf-7]
> > After the head of the queue acquires the lock, it releases the
> > next waiter in the queue to become the new head. Add an option
> > to prod the new head if its vCPU was preempted. This may only
> > have an effect if queue waiters are yielding.
> > 
> > Disable this option by default for now, i.e., no logical change.
> > ---
> >  arch/powerpc/lib/qspinlock.c | 29 ++++++++++++++++++++++++++++-
> >  1 file changed, 28 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
> > index 28c85a2d5635..3b10e31bcf0a 100644
> > --- a/arch/powerpc/lib/qspinlock.c
> > +++ b/arch/powerpc/lib/qspinlock.c
> > @@ -12,6 +12,7 @@
> >  struct qnode {
> >  	struct qnode	*next;
> >  	struct qspinlock *lock;
> > +	int		cpu;
> >  	int		yield_cpu;
> >  	u8		locked; /* 1 if lock acquired */
> >  };
> > @@ -30,6 +31,7 @@ static bool pv_yield_owner __read_mostly = true;
> >  static bool pv_yield_allow_steal __read_mostly = false;
> >  static bool pv_yield_prev __read_mostly = true;
> >  static bool pv_yield_propagate_owner __read_mostly = true;
> > +static bool pv_prod_head __read_mostly = false;
> >  
> >  static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes);
> >  
> > @@ -392,6 +394,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b
> >  	node = &qnodesp->nodes[idx];
> >  	node->next = NULL;
> >  	node->lock = lock;
> > +	node->cpu = smp_processor_id();
>
> I suppose this could be used in some other places too.
>
> For example, change:
> 	yield_to_prev(lock, node, prev, paravirt);
>
> In yield_to_prev() it could then access prev->cpu.
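
That is, yield_to_prev() would take the prev qnode and read the CPU
out of it -- something like this sketch (hypothetical helper name, not
code from the series):

	/* Suggested shape: read the previous waiter's CPU from the
	 * prev qnode itself, via the new ->cpu field. */
	static __always_inline int prev_cpu_from_node(struct qnode *prev)
	{
		/* This loads from prev's (remote) cache line. */
		return READ_ONCE(prev->cpu);
	}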

That case is a bit iffy. As soon as we WRITE_ONCE our node into
prev->next, the owner of prev can go away. The qnodes are statically
allocated per-CPU, so prev->cpu should actually still give us the
right value even if that CPU has since queued on some other lock, but
I think it's more straightforward just not to touch prev after that
point. prev is also a remote, hot cache line, so avoiding any loads
from it is nice (we do have the store there, but we don't have to
wait for stores), and we already have val.
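
Sticking with val keeps recovering the CPU a pure computation, along
these lines -- a sketch, assuming a decode_tail_cpu()-style helper
that decodes the CPU number from the tail word:

	/* The previous waiter's CPU is already encoded in the tail we
	 * hold in val, so no load from the remote qnode is needed. */
	static __always_inline int prev_cpu_from_val(u32 val)
	{
		return decode_tail_cpu(val);
	}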

Thanks,
Nick

