[PATCH 13/17] powerpc/qspinlock: trylock and initial lock attempt may steal
Nicholas Piggin
npiggin at gmail.com
Thu Nov 10 22:35:29 AEDT 2022
On Thu Nov 10, 2022 at 10:43 AM AEST, Jordan Niethe wrote:
> On Thu, 2022-07-28 at 16:31 +1000, Nicholas Piggin wrote:
> [resend as utf-8, not utf-7]
> > This gives trylock slightly more strength, and it also gives most
> > of the benefit of passing 'val' back through the slowpath without
> > the complexity.
> > ---
> > arch/powerpc/include/asm/qspinlock.h | 39 +++++++++++++++++++++++++++-
> > arch/powerpc/lib/qspinlock.c | 9 +++++++
> > 2 files changed, 47 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h
> > index 44601b261e08..d3d2039237b2 100644
> > --- a/arch/powerpc/include/asm/qspinlock.h
> > +++ b/arch/powerpc/include/asm/qspinlock.h
> > @@ -5,6 +5,8 @@
> > #include <linux/compiler.h>
> > #include <asm/qspinlock_types.h>
> >
> > +#define _Q_SPIN_TRY_LOCK_STEAL 1
>
> Would this be a config option?
I think probably not; it's more to keep the other code variant around in
case we want to tune and experiment with it. We might end up cutting out
a bunch of these options once we narrow down on a good configuration.
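For reference, the define just selects which variant queued_spin_trylock()
resolves to at compile time. The selecting wrapper is in a part of the hunk
that isn't quoted below, so take this as a rough sketch rather than the
patch text:

	/*
	 * Illustration only, not the patch text: _Q_SPIN_TRY_LOCK_STEAL
	 * picks which trylock variant the public name resolves to.
	 */
	static __always_inline int queued_spin_trylock(struct qspinlock *lock)
	{
		if (_Q_SPIN_TRY_LOCK_STEAL)
			return __queued_spin_trylock_steal(lock);
		else
			return __queued_spin_trylock_nosteal(lock);
	}
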
>
> > +
> > static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
> > {
> > return READ_ONCE(lock->val);
> > @@ -26,11 +28,12 @@ static __always_inline u32 queued_spin_get_locked_val(void)
> > return _Q_LOCKED_VAL | (smp_processor_id() << _Q_OWNER_CPU_OFFSET);
> > }
> >
> > -static __always_inline int queued_spin_trylock(struct qspinlock *lock)
> > +static __always_inline int __queued_spin_trylock_nosteal(struct qspinlock *lock)
> > {
> > u32 new = queued_spin_get_locked_val();
> > u32 prev;
> >
> > + /* Trylock succeeds only when unlocked and no queued nodes */
> > asm volatile(
> > "1: lwarx %0,0,%1,%3 # queued_spin_trylock \n"
>
> s/queued_spin_trylock/__queued_spin_trylock_nosteal
I wanted to keep those names because these functions can be inlined into the
wider kernel, so you'd rather see queued_spin_trylock than this internal name.
> > " cmpwi 0,%0,0 \n"
> > @@ -49,6 +52,40 @@ static __always_inline int queued_spin_trylock(struct qspinlock *lock)
> > return 0;
> > }
> >
> > +static __always_inline int __queued_spin_trylock_steal(struct qspinlock *lock)
> > +{
> > + u32 new = queued_spin_get_locked_val();
> > + u32 prev, tmp;
> > +
> > + /* Trylock may get ahead of queued nodes if it finds unlocked */
> > + asm volatile(
> > +"1: lwarx %0,0,%2,%5 # queued_spin_trylock \n"
>
> s/queued_spin_trylock/__queued_spin_trylock_steal
>
> > +" andc. %1,%0,%4 \n"
> > +" bne- 2f \n"
> > +" and %1,%0,%4 \n"
> > +" or %1,%1,%3 \n"
> > +" stwcx. %1,0,%2 \n"
> > +" bne- 1b \n"
> > +"\t" PPC_ACQUIRE_BARRIER " \n"
> > +"2: \n"
>
> Just because there's a little bit more going on here...
>
> _Q_TAIL_CPU_MASK = 0xFFFE0000
> ~_Q_TAIL_CPU_MASK = 0x0001FFFF
>
>
> 1: lwarx prev, 0, &lock->val, IS_ENABLED_PPC64
> andc. tmp, prev, _Q_TAIL_CPU_MASK (tmp = prev & ~_Q_TAIL_CPU_MASK)
> bne- 2f (exit if locked)
> and tmp, prev, _Q_TAIL_CPU_MASK (tmp = prev & _Q_TAIL_CPU_MASK)
> or tmp, tmp, new (tmp |= new)
> stwcx. tmp, 0, &lock->val
>
> bne- 1b
> PPC_ACQUIRE_BARRIER
> 2:
>
> ... which seems correct.
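In plain C, that ll/sc loop corresponds roughly to the sketch below
(illustration only, not the patch code; _Q_TAIL_CPU_MASK and
queued_spin_get_locked_val() are the helpers from the quoted header, and
cmpxchg_acquire stands in for the lwarx/stwcx. pair plus the acquire
barrier):

	static __always_inline int __trylock_steal_sketch(struct qspinlock *lock)
	{
		u32 new = queued_spin_get_locked_val();
		u32 prev, tmp;

		do {
			prev = READ_ONCE(lock->val);
			if (prev & ~_Q_TAIL_CPU_MASK)	/* locked: fail immediately */
				return 0;
			/* keep any queued tail, set the locked bit + owner cpu */
			tmp = (prev & _Q_TAIL_CPU_MASK) | new;
		} while (cmpxchg_acquire(&lock->val, prev, tmp) != prev);

		return 1;
	}

The difference from the nosteal variant is that a non-zero tail alone does
not make the trylock fail, so a trylock caller may take the lock ahead of
CPUs already queued on it.
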
Thanks,
Nick