[RFC/PATCH 2/3] pseries/kvm: Clear PSSCR[ESL|EC] bits before guest entry

David Gibson david at gibson.dropbear.id.au
Mon Apr 6 19:58:19 AEST 2020


On Fri, Apr 03, 2020 at 03:01:03PM +0530, Gautham R Shenoy wrote:
> On Fri, Apr 03, 2020 at 12:20:26PM +1000, Nicholas Piggin wrote:
> > Gautham R. Shenoy's on March 31, 2020 10:10 pm:
> > > From: "Gautham R. Shenoy" <ego at linux.vnet.ibm.com>
> > > 
> > > ISA v3.0 allows the guest to execute a stop instruction. For this, the
> > > PSSCR[ESL|EC] bits need to be cleared by the hypervisor before
> > > scheduling in the guest vCPU.
> > > 
> > > Currently we always schedule in a vCPU with PSSCR[ESL|EC] bits
> > > set. This patch changes the behaviour to enter the guest with
> > > PSSCR[ESL|EC] bits cleared. This is a RFC patch where we
> > > unconditionally clear these bits. Ideally this should be done
> > > conditionally on platforms where the guest stop instruction has no
> > > Bugs (starting POWER9 DD2.3).
> > 
> > How will guests know that they can use this facility safely after your
> > series? You need both DD2.3 and a patched KVM.
> 
> 
> Yes, this is something that isn't addressed in this series (mentioned
> in the cover letter), which is a POC demonstrating that the stop0lite
> state in guest works.
> 
> However, to answer your question, this is the scheme that I had in
> mind :
> 
> OPAL:
>    On Procs >= DD2.3 : we publish a dt-cpu-feature "idle-stop-guest"
> 
> Hypervisor Kernel:
>     1. If "idle-stop-guest" dt-cpu-feature is discovered, then
>        we set bool enable_guest_stop = true;
> 
>     2. During KVM guest entry, clear PSSCR[ESL|EC] iff
>        enable_guest_stop == true.
> 
>     3. In kvm_vm_ioctl_check_extension(), for a new capability
>        KVM_CAP_STOP, return true iff enable_guest_top == true.
> 
> QEMU:
>    Check with the hypervisor if KVM_CAP_STOP is present. If so,
>    indicate the presence to the guest via device tree.

Nack.  Presenting different capabilities to the guest depending on
host capabilities (rather than explicit options) is never ok.  It
means that depending on the system you start on you may or may not be
able to migrate to other systems that you're supposed to be able to,

> Guest Kernel:
>    Check for the presence of guest stop state support in
>    device-tree. If available, enable the stop0lite in the cpuidle
>    driver. 
>    
> 
> We still have a challenge of migrating a guest which started on a
> hypervisor supporting guest stop state to a hypervisor without it.
> The target hypervisor should atleast have Patch 1 of this series, so
> that we don't crash the guest.
> 
> > 
> > > 
> > > Signed-off-by: Gautham R. Shenoy <ego at linux.vnet.ibm.com>
> > > ---
> > >  arch/powerpc/kvm/book3s_hv.c            |  2 +-
> > >  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 25 +++++++++++++------------
> > >  2 files changed, 14 insertions(+), 13 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> > > index cdb7224..36d059a 100644
> > > --- a/arch/powerpc/kvm/book3s_hv.c
> > > +++ b/arch/powerpc/kvm/book3s_hv.c
> > > @@ -3424,7 +3424,7 @@ static int kvmhv_load_hv_regs_and_go(struct kvm_vcpu *vcpu, u64 time_limit,
> > >  	mtspr(SPRN_IC, vcpu->arch.ic);
> > >  	mtspr(SPRN_PID, vcpu->arch.pid);
> > >  
> > > -	mtspr(SPRN_PSSCR, vcpu->arch.psscr | PSSCR_EC |
> > > +	mtspr(SPRN_PSSCR, (vcpu->arch.psscr  & ~(PSSCR_EC | PSSCR_ESL)) |
> > >  	      (local_paca->kvm_hstate.fake_suspend << PSSCR_FAKE_SUSPEND_LG));
> > >  
> > >  	mtspr(SPRN_HFSCR, vcpu->arch.hfscr);
> > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > > index dbc2fec..c2daec3 100644
> > > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> > > @@ -823,6 +823,18 @@ END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
> > >  	mtspr	SPRN_PID, r7
> > >  	mtspr	SPRN_WORT, r8
> > >  BEGIN_FTR_SECTION
> > > +	/* POWER9-only registers */
> > > +	ld	r5, VCPU_TID(r4)
> > > +	ld	r6, VCPU_PSSCR(r4)
> > > +	lbz	r8, HSTATE_FAKE_SUSPEND(r13)
> > > +	lis 	r7, (PSSCR_EC | PSSCR_ESL)@h /* Allow guest to call stop */
> > > +	andc	r6, r6, r7
> > > +	rldimi	r6, r8, PSSCR_FAKE_SUSPEND_LG, 63 - PSSCR_FAKE_SUSPEND_LG
> > > +	ld	r7, VCPU_HFSCR(r4)
> > > +	mtspr	SPRN_TIDR, r5
> > > +	mtspr	SPRN_PSSCR, r6
> > > +	mtspr	SPRN_HFSCR, r7
> > > +FTR_SECTION_ELSE
> > 
> > Why did you move these around? Just because the POWER9 section became
> > larger than the other?
> 
> Yes.
> 
> > 
> > That's a real wart in the instruction patching implementation, I think
> > we can fix it by padding with nops in the macros.
> > 
> > Can you just add the additional required nops to the top branch without
> > changing them around for this patch, so it's easier to see what's going
> > on? The end result will be the same after patching. Actually changing
> > these around can have a slight unintended consequence in that code that
> > runs before features were patched will execute the IF code. Not a
> > problem here, but another reason why the instruction patching 
> > restriction is annoying.
> 
> Sure, I will repost this patch with additional nops instead of
> moving them around.
> 
> > 
> > Thanks,
> > Nick
> > 
> > >  	/* POWER8-only registers */
> > >  	ld	r5, VCPU_TCSCR(r4)
> > >  	ld	r6, VCPU_ACOP(r4)
> > > @@ -833,18 +845,7 @@ BEGIN_FTR_SECTION
> > >  	mtspr	SPRN_CSIGR, r7
> > >  	mtspr	SPRN_TACR, r8
> > >  	nop
> > > -FTR_SECTION_ELSE
> > > -	/* POWER9-only registers */
> > > -	ld	r5, VCPU_TID(r4)
> > > -	ld	r6, VCPU_PSSCR(r4)
> > > -	lbz	r8, HSTATE_FAKE_SUSPEND(r13)
> > > -	oris	r6, r6, PSSCR_EC at h	/* This makes stop trap to HV */
> > > -	rldimi	r6, r8, PSSCR_FAKE_SUSPEND_LG, 63 - PSSCR_FAKE_SUSPEND_LG
> > > -	ld	r7, VCPU_HFSCR(r4)
> > > -	mtspr	SPRN_TIDR, r5
> > > -	mtspr	SPRN_PSSCR, r6
> > > -	mtspr	SPRN_HFSCR, r7
> > > -ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
> > > +ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
> > >  8:
> > >  
> > >  	ld	r5, VCPU_SPRG0(r4)
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20200406/4077f36d/attachment-0001.sig>


More information about the Linuxppc-dev mailing list