[PATCH 2/5] powerpc/64: stop using bit in HSPRG0 to test winkle
Nicholas Piggin
npiggin at gmail.com
Wed Mar 1 02:36:07 AEDT 2017
On Tue, 28 Feb 2017 20:38:19 +0530
Gautham R Shenoy <ego.lkml at gmail.com> wrote:
> Hi Nick,
>
> On Fri, Feb 17, 2017 at 12:08 AM, Nicholas Piggin <npiggin at gmail.com> wrote:
> > The POWER8 idle code has a neat trick of programming the power on engine
> > to restore a low bit into HSPRG0, so idle wakeup code can test and see
> > if it has been programmed this way and therefore lost all state. Restore
> > time can be reduced if winkle has not been reached.
> >
>
> > However this messes with our r13 PACA pointer, and requires HSPRG0 to
> > be written to. It also optimizes the slowest and most uncommon case at
> > the expense of another SPR write in the common nap state wakeup.
>
> Actually, this optimization was needed to reduce the guest entry time
> for a KVM guest vcore.
>
> While running KVM on POWER8, the host is in SMT=1 mode with the
> secondary threads being hotplugged out and supposed to have entered
> the winkle state. Since the primary thread is still running, the core
> would still be in Nap.
>
> However, each time a guest vcore is scheduled on the core, the
> secondary threads are sent an IPI to wake them up from the idle state.
>
> On waking up, they use the last bit of HSPRG0 and conclude that they
> don't need to restore resources before executing the guest entry code.
>
> Thus, the HSPRG0 trick helps the secondary threads avoid spending
> extra cycles restoring the SLBs and per-thread SPRs.
Ah, thanks. Until now I didn't really understand why that case was
optimized. My mistake, and apologies for assuming there was no good
reason for it :)
> > Remove this complexity and assume winkle sleeps always require a state
> > restore. This speedup could be made entirely contained within the winkle
> > idle code by counting per-core winkles and setting a thread bitmap when
> > all have gone to winkle.
>
> This is a good idea! We are anyway taking the lock in pnv_wakeup_tb_loss()
> to check if we are the first thread in the core waking up from a deep-state.
>
> We can check another bitmap if we are the first thread waking up from winkle
> and set cr4 to the result of that comparison while holding the lock.
>
> That shouldn't cost us much, at least for the fake-wakeup-from-winkle
> for KVM case alluded to above.
>
> Otherwise, the patch looks good to me.
Okay, I had a half-done patch for this. I'll finish it and send it
out for review.
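
For illustration only, here is a rough C sketch of the counting idea
discussed above: count the threads of a core that are in winkle, and
only when all of them are there mark every thread as having lost state.
This is not the half-done patch and not the kernel's code. The real
logic would sit in the idle assembly under the existing per-core lock,
and every name below (core_idle_state, winkle_enter, winkle_exit, the
8-thread limit) is a hypothetical stand-in.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-core bookkeeping; not the kernel's actual layout. */
struct core_idle_state {
    atomic_flag lock;             /* stand-in for the per-core idle lock */
    unsigned int nr_threads;      /* threads per core (assumed <= 8) */
    unsigned int winkle_count;    /* threads currently in winkle */
    unsigned int lost_state_mask; /* threads that must do a full restore */
};

static void core_lock(struct core_idle_state *c)
{
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ; /* spin until the lock is free */
}

static void core_unlock(struct core_idle_state *c)
{
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

void core_idle_init(struct core_idle_state *c, unsigned int nr_threads)
{
    assert(nr_threads > 0 && nr_threads <= 8);
    atomic_flag_clear(&c->lock);
    c->nr_threads = nr_threads;
    c->winkle_count = 0;
    c->lost_state_mask = 0;
}

/* Called by a thread just before it requests winkle. */
void winkle_enter(struct core_idle_state *c, unsigned int tid)
{
    assert(tid < c->nr_threads);
    core_lock(c);
    /* Only when every thread has gone to winkle can the core really
     * power down, so only then flag all threads as having lost state. */
    if (++c->winkle_count == c->nr_threads)
        c->lost_state_mask = (1u << c->nr_threads) - 1;
    core_unlock(c);
}

/* Called on wakeup; returns true if this thread needs a full restore. */
bool winkle_exit(struct core_idle_state *c, unsigned int tid)
{
    unsigned int bit = 1u << tid;
    bool lost;

    assert(tid < c->nr_threads);
    core_lock(c);
    c->winkle_count--;
    lost = (c->lost_state_mask & bit) != 0;
    c->lost_state_mask &= ~bit; /* consume this thread's flag */
    core_unlock(c);
    return lost;
}
```

In the KVM case described above, the primary thread never calls
winkle_enter(), so the count never reaches nr_threads and the
secondaries see false from winkle_exit() and can skip the SLB and SPR
restore.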
Thanks,
Nick