[PATCH] powerpc: Improve IRQ radix tree locking

Benjamin Herrenschmidt benh at kernel.crashing.org
Sat Aug 26 08:54:23 EST 2006


On Fri, 2006-08-25 at 09:42 -0500, Milton Miller wrote:
> On Fri Aug 25 16:04:52 EST 2006,  Benjamin Herrenschmidt wrote:
> > When reworking the powerpc irq code, I figured out that we were using
> > the radix tree in a racy way. As a temporary fix, I put a spinlock in
> > there. However, this can have a significant impact on performances. 
> > This
> > patch reworks that to use a smarter technique based on the fact that
> > what we need is in fact a rwlock with extremely rare writers (thus
> > optimized for the read path).
> 
> 
> > +static void irq_radix_wrlock(unsigned long *flags)
> > +{
> > +       unsigned int cpu, ok;
> > +
> > +       spin_lock_irqsave(&irq_big_lock, *flags);
> > +       irq_radix_writer = 1;
> 
> I think the smp_mb() is needed after here
> 
> > +       do {
> > +               ok = 1;
> > +               smp_mb();
> 
> but not every time we poll the cpus.  We are only
> updating our local variable ok which we have not
> given anyone else.   It could even be a register.

Doesn't really matter, the write case is extremely rare and not at all
performance critical. If I don't put the mb in the loop, I wonder if the
compiler might start optimizing out the reads... and I don't want to add
a volatile in there. I feel safer with the smp_mb() in there.

> > +               for_each_possible_cpu(cpu) {
> > +                       if (per_cpu(irq_radix_reader, cpu)) {
> > +                               ok = 0;
> > +                               break;
> > +                       }
> > +               }
> > +               if (!ok)
> > +                       cpu_relax();
> 
> Hmmm.  the gcc barrier is conditional.   How about putting
> barrier() before the ok=1 at the top, to avoid any optimization
> blockage but still telling gcc you must read every time?

Yeah, well, as I said, smp_mb() hits both cases and this isn't perf.
critical... 

> > +       } while(!ok);
> > +}
> > +
> 
> Oh, and how about some (un)likely in irq_radix_rdlock ?  It
> could happen by default by why not show our expected path?

Yup, that would make sense. I might also drop the first test of
irq_radix_writer. The will penalize a bit more the contention case vs.
the read case but that's exactly what I want...

> And, while I'm thinking of changes, when set ok=0 above add:
> #ifdef CONFIG_SPINLOCK_DEBUG
>                                BUG_ON(cpu == smp_processor_id());
> #endif

Do you see any case where I might actually hit that ? :)

> Hmm... just thought of something .... we are spinning under
> irq lock.  We are waiting on some other thread, but not
> telling the hypervisor.   Should we confer our execution
> over to the cpu we are waiting on?  I know, it sounds ugly
> to call that code.   I guess we would want a
> cpu_waiting_on(cpu) that calls cpu_relax and/or hypervisor.

I'm not sure this will really matter in real life. The write case is
really so rare and the contention case even more...

Ben.





More information about the Linuxppc-dev mailing list