[RFC][PATCH] spin loop arch primitives for busy waiting

Linus Torvalds torvalds at linux-foundation.org
Fri Apr 7 05:41:52 AEST 2017


On Thu, Apr 6, 2017 at 12:23 PM, Peter Zijlstra <peterz at infradead.org> wrote:
>
> Something like so then. According to the SDM mwait is a no-op if we do
> not execute monitor first. So this variant should get the first
> iteration without expensive instructions.

No, the problem is that we *would* have executed a prior monitor that
could still be pending - from a previous invocation of
smp_cond_load_acquire().

Especially with spinlocks, these things can very much happen back-to-back.

And it would be pending with a different address (the previous
spinlock) that might not have changed since then (and might not be
changing), so now we might actually be pausing in mwait waiting for
that *other* thing to change.
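
A toy userspace model of that failure mode, purely as an illustration. It
assumes a CPU remembers exactly one armed address and that mwait sleeps on
whatever is armed; the names toy_cpu, toy_monitor, toy_mwait_target and
toy_stale_monitor are hypothetical and say nothing about how real hardware
or the kernel implements this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only (an assumption): one pending monitor address per CPU. */
struct toy_cpu {
	const volatile int *armed;	/* NULL when no monitor is pending */
};

/* Arm the monitor on addr, replacing whatever was armed before. */
static void toy_monitor(struct toy_cpu *cpu, const volatile int *addr)
{
	cpu->armed = addr;
}

/* Address an mwait issued right now would sleep on (NULL: falls through). */
static const volatile int *toy_mwait_target(const struct toy_cpu *cpu)
{
	return cpu->armed;
}

/*
 * Two back-to-back waits, first on lock_a and then on lock_b.
 * Returns true if the first mwait of the second wait would end up
 * sleeping on lock_a, i.e. on the stale address.
 */
bool toy_stale_monitor(bool rearm_first)
{
	struct toy_cpu cpu = { .armed = NULL };
	volatile int lock_a = 1, lock_b = 1;

	/*
	 * First wait: the monitor is armed on lock_a, the re-check then
	 * succeeds, so no mwait runs and the monitor stays pending.
	 */
	toy_monitor(&cpu, &lock_a);

	/* Second wait: optionally arm on lock_b before the first mwait. */
	if (rearm_first)
		toy_monitor(&cpu, &lock_b);

	return toy_mwait_target(&cpu) == &lock_a;
}
```

In this model toy_stale_monitor(false) reports the stale wait on lock_a,
while toy_stale_monitor(true) does not, because the monitor is re-armed on
the new address before any mwait is issued.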

So it would probably need to do something complicated like

  #define smp_cond_load_acquire(ptr, cond_expr)                         \
  ({                                                                    \
        typeof(ptr) __PTR = (ptr);                                      \
        typeof(*ptr) VAL;                                               \
        do {                                                            \
                VAL = READ_ONCE(*__PTR);                                \
                if (cond_expr)                                          \
                        break;                                          \
                for (;;) {                                              \
                        ___monitor(__PTR, 0, 0);                        \
                        VAL = READ_ONCE(*__PTR);                        \
                        if (cond_expr) break;                           \
                        ___mwait(0xf0 /* C0 */, 0);                     \
                }                                                       \
        } while (0);                                                    \
        smp_acquire__after_ctrl_dep();                                  \
        VAL;                                                            \
  })

which might just generate nasty enough code to not be worth it.

I dunno

             Linus

