[RFC PATCH] powerpc/64s: Move ISAv3.0 / POWER9 idle code to powernv C code
Nicholas Piggin
npiggin at gmail.com
Wed Jul 11 18:30:36 AEST 2018
On Tue, 10 Jul 2018 16:36:34 +0530
Gautham R Shenoy <ego at linux.vnet.ibm.com> wrote:
> Hello Nicholas,
>
>
> On Mon, Jul 09, 2018 at 12:24:36AM +1000, Nicholas Piggin wrote:
> > Reimplement POWER9 idle code in C, in the powernv platform code.
> > Assembly stubs are used to save and restore the stack frame and
> > non-volatile GPRs before going to idle, but these are small and
> > mostly agnostic to microarchitecture implementation details.
> >
>
> Thanks for this patch. It is indeed hard to maneuver through the
> current assembly code and change things without introducing new bugs.
Hey thanks for the very good review. I don't mean to disparage the
existing asm code too much :) We were doing our best there to get
something working, now I think it's a good time to step back and see
what we can improve.
> > POWER7/8 code is not converted (yet), but that's not a moving
> > target, and it doesn't make you want to claw your eyes out so
> > much with the POWER9 code untangled from it.
> >
> > The optimisation where EC=ESL=0 idle modes did not have to save
> > GPRs or mtmsrd L=0 is restored, because it's simple to do.
> >
> > Idle wakeup no longer uses the ->cpu_restore call to reinit SPRs,
> > but saves and restores them all explicitly.
>
> Right! The ->cpu_restore call in the current code is rendered
> ineffective by "restore_additional_sprs" which is called after the
> cpu_restore.
Yes, I would actually like to do an initial incremental patch for this,
and remove it from P7/8 CPU types as well. I think that's quite easy to
do, and a useful cleanup to remove it entirely.
> > diff --git a/arch/powerpc/include/asm/cpuidle.h b/arch/powerpc/include/asm/cpuidle.h
> > index e210a83eb196..b668f030d531 100644
> > --- a/arch/powerpc/include/asm/cpuidle.h
> > +++ b/arch/powerpc/include/asm/cpuidle.h
> > @@ -28,6 +28,7 @@
> > * yet woken from the winkle state.
> > */
> > #define PNV_CORE_IDLE_LOCK_BIT 0x10000000
> > +#define NR_PNV_CORE_IDLE_LOCK_BIT 28
>
> We can define PNV_CORE_IDLE_LOCK_BIT mask based on
> NR_PNV_CORE_IDLE_LOCK_BIT ?
Yep good point.
> > diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> > index 76a14702cb9c..36b5f0e18c0c 100644
> > --- a/arch/powerpc/kernel/exceptions-64s.S
> > +++ b/arch/powerpc/kernel/exceptions-64s.S
> > @@ -135,8 +135,14 @@ TRAMP_KVM(PACA_EXNMI, 0x100)
> >
> > #ifdef CONFIG_PPC_P7_NAP
> > EXC_COMMON_BEGIN(system_reset_idle_common)
> > +BEGIN_FTR_SECTION
> > + mfspr r3,SPRN_SRR1
> > + bltlr cr3 /* no state loss, return to idle caller */
>
> This is a nice optimization. But perhaps needs a carefully checked.
> We wakeup at SYSTEM_RESET only when we have executed stop via the
> mayloss() call which creates an additional stack frame. When we return
> back to the caller, we will go back with that stack frame. Perhaps we
> need to jump to a isa3_idle_wake_gpr_noloss ?
I wasn't 100% sure of all this so yes it's something I do need to
verify carefully. I did test a few different cases in mambo and stepped
through things, but could easily have bugs.
I could have got something wrong, but I think mayloss does not create
an additional stack frame, so this works okay. It's just using red zone.
... I think...
> > +_GLOBAL(isa3_idle_stop_mayloss)
> > + std r1,PACAR1(r13)
> > + mflr r4
> > + mfcr r5
> > + /* use stack red zone rather than a new frame */
> > + addi r6,r1,-INT_FRAME_SIZE
> > + SAVE_GPR(2, r6)
> > + SAVE_NVGPRS(r6)
> > + std r4,_LINK(r6)
> > + std r5,_CCR(r6)
> > + mtspr SPRN_PSSCR,r3
> > + PPC_STOP
>
>
> Suppose we enter into a isa3_idle_stop_may_loss (ESL=EC=1), but as
> soon as we execute PPC_STOP, if we get interrupted, we will go back to
> SYSTEM_RESET vector.
>
> The SRR1[46:47] should be 0x1, due to which we will do a bl there,
> thereby going back to the caller of isa3_idle_stop_mayloss. Which
> implies that we need to perform the stack unwinding there.
Well we haven't modified r1 here, so I thought it should be okay. It
seems to do the right thing in mambo.
> > +static inline void atomic_unlock_thread_idle(void)
> > +{
> > + int cpu = raw_smp_processor_id();
> > + int first = cpu_first_thread_sibling(cpu);
> > + unsigned long *state = &paca_ptrs[first]->idle_state;
> > +
> > + clear_bit_unlock(NR_PNV_CORE_IDLE_LOCK_BIT, state);
> > +}
> > +
> > +static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>
> We are not using the mmu_on anywhere in the code. Is this for
> restoring the IR|DR bit in the MSR ?
Hmm, it's supposed to be for KVM secondaries that have to have the MMU
switched off around idle/wake. I haven't got all the KVM stuff right
yet though.
Thanks,
Nick
More information about the Linuxppc-dev
mailing list