[PATCH][RFC] Implement arch primitives for busywait loops
Nicholas Piggin
npiggin at gmail.com
Fri Sep 16 21:52:00 AEST 2016
On Fri, 16 Sep 2016 11:30:58 +0000
David Laight <David.Laight at ACULAB.COM> wrote:
> From: Nicholas Piggin
> > Sent: 16 September 2016 09:58
> > Implementing busy wait loops with cpu_relax() in callers poses
> > some difficulties for powerpc.
> >
> > First, we want to put our SMT thread into a low priority mode for the
> > duration of the loop, but then return to normal priority after exiting
> > the loop. Depending on the CPU design, 'HMT_low(); HMT_medium();' (as
> > cpu_relax() does) may have HMT_medium take effect before HMT_low has made
> > any (or much) difference.
> >
> > Second, it can be beneficial for some implementations to spin on the
> > exit condition with a statically predicted-not-taken branch (i.e.,
> > always predict the loop will exit).
> >
> > This is a quick RFC with a couple of users converted to see what
> > people think. I don't use a C branch with hints, because we don't want
> > the compiler moving the loop body out of line, which makes it a bit
> > messy unfortunately. If there's a better way to do it, I'm all ears.
>
> I think it will still all go wrong if the conditional isn't trivial.
> In particular if the condition contains || or && it is likely to
> have a branch - which could invert the loop.

I don't know that it will.

Yes, if we have an exit condition that requires more branches in order to
be computed, then we lose our nice property of never taking a branch
miss on loop exit. But we still avoid *this* branch miss, and still
prevent multiple iterations of the wait loop being speculatively
executed concurrently when there's no work to be done.

And C doesn't know about the loop, so it can't do any transformation
except to compute the final condition.

Or have I missed something?
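
To make the shape concrete, here is a rough sketch in portable C, not the
code from the patch. All the names (smt_low, smt_medium, spin_until,
check_done, wait_for_done_or_abort) and the counters are made up for
illustration. The stubs only count calls where the real thing would use
HMT_low()/HMT_medium(), and a plain C while loop stands in for the asm
loop-back branch. So this shows only the structure: priority is changed once
on entry and once on exit, and a compound condition is reduced to a single
truth value before the loop's own test. It does not show the
static-prediction property, which needs the loop to be hidden from the
compiler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the priority hooks; here they only count calls. */
static int low_calls, medium_calls, polls;

static void smt_low(void)    { low_calls++; }    /* would be HMT_low() */
static void smt_medium(void) { medium_calls++; } /* would be HMT_medium() */

/*
 * Hypothetical spin primitive: drop priority once, poll, restore once.
 * Any branches inside cond (from && or ||) belong to the caller's
 * expression; the primitive only tests the final truth value.
 */
#define spin_until(cond)        \
do {                            \
	smt_low();              \
	while (!(cond))         \
		polls++;        \
	smt_medium();           \
} while (0)

/* Flags a waiter might poll; volatile so every poll re-reads them. */
static volatile bool done, aborted;
static int countdown;

/* Simulates another CPU setting 'done' on the Nth poll. */
static bool check_done(void)
{
	if (countdown > 0 && --countdown == 0)
		done = true;
	return done;
}

static void wait_for_done_or_abort(int polls_until_done)
{
	countdown = polls_until_done;
	spin_until(check_done() || aborted);
}

/* Usage: each hook runs exactly once, however many polls happen. */
static void demo(void)
{
	wait_for_done_or_abort(3);
	assert(low_calls == 1 && medium_calls == 1);
}
```

With the condition becoming true on the third evaluation, the loop body runs
twice and each priority hook runs once, unlike a cpu_relax() that toggles
priority on every iteration.
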
Thanks,
Nick