[RFC/PATCH 2/3] pseries/kvm: Clear PSSCR[ESL|EC] bits before guest entry

Gautham R Shenoy ego at linux.vnet.ibm.com
Tue Apr 7 23:25:26 AEST 2020


Hello David,

On Mon, Apr 06, 2020 at 07:58:19PM +1000, David Gibson wrote:
> On Fri, Apr 03, 2020 at 03:01:03PM +0530, Gautham R Shenoy wrote:
> > On Fri, Apr 03, 2020 at 12:20:26PM +1000, Nicholas Piggin wrote:
> > > Gautham R. Shenoy's on March 31, 2020 10:10 pm:
> > > > From: "Gautham R. Shenoy" <ego at linux.vnet.ibm.com>
> > > > 
> > > > ISA v3.0 allows the guest to execute a stop instruction. For this, the
> > > > PSSCR[ESL|EC] bits need to be cleared by the hypervisor before
> > > > scheduling in the guest vCPU.
> > > > 
> > > > Currently we always schedule in a vCPU with PSSCR[ESL|EC] bits
> > > > set. This patch changes the behaviour to enter the guest with
> > > > PSSCR[ESL|EC] bits cleared. This is a RFC patch where we
> > > > unconditionally clear these bits. Ideally this should be done
> > > > conditionally on platforms where the guest stop instruction has no
> > > > Bugs (starting POWER9 DD2.3).
> > > 
> > > How will guests know that they can use this facility safely after your
> > > series? You need both DD2.3 and a patched KVM.
> > 
> > 
> > Yes, this is something that isn't addressed in this series (mentioned
> > in the cover letter), which is a POC demonstrating that the stop0lite
> > state in guest works.
> > 
> > However, to answer your question, this is the scheme that I had in
> > mind :
> > 
> > OPAL:
> >    On Procs >= DD2.3 : we publish a dt-cpu-feature "idle-stop-guest"
> > 
> > Hypervisor Kernel:
> >     1. If "idle-stop-guest" dt-cpu-feature is discovered, then
> >        we set bool enable_guest_stop = true;
> > 
> >     2. During KVM guest entry, clear PSSCR[ESL|EC] iff
> >        enable_guest_stop == true.
> > 
> >     3. In kvm_vm_ioctl_check_extension(), for a new capability
> >        KVM_CAP_STOP, return true iff enable_guest_top == true.
> > 
> > QEMU:
> >    Check with the hypervisor if KVM_CAP_STOP is present. If so,
> >    indicate the presence to the guest via device tree.
> 
> Nack.  Presenting different capabilities to the guest depending on
> host capabilities (rather than explicit options) is never ok.  It
> means that depending on the system you start on you may or may not be
> able to migrate to other systems that you're supposed to be able to,

I agree that blocking migration for the unavailability of this feature
is not desirable. Could you point me to some other capabilities in KVM
which have been implemented via explicit options?

The ISA 3.0 allows the guest to execute the "stop" instruction. If the
Hypervisor hasn't cleared the PSSCR[ESL|EC] then, guest executing the
"stop" instruction in the causes a Hypervisor Facility Unavailable
Exception, thus giving the hypervisor a chance to emulate the
instruction. However, in the current code, when the hypervisor
receives this exception, it sends a PROGKILL to the guest which
results in crashing the guest.

Patch 1 of this series emulates wakeup from the "stop"
instruction. Would the following scheme be ok?

OPAL:
	On Procs >= DD2.3 : we publish a dt-cpu-feature "idle-stop-guest"

Hypervisor Kernel:

	   If "idle-stop-guest" dt feature is available, then, before
	   entering the guest, the hypervisor clears the PSSCR[EC|ESL]
	   bits allowing the guest to safely execute stop instruction.

	   If "idle-stop-guest" dt feature is not available, then, the
	   Hypervisor sets the PSSCR[ESL|EC] bits, thereby causing a
	   guest "stop" instruction execution to trap back into the
	   hypervisor. We then emulate a wakeup from the stop
	   instruction (Patch 1 of this series).

Guest Kernel:
      If (cpu_has_feature(CPU_FTR_ARCH_300)) only then use the
      stop0lite cpuidle state.

This allows us to migrate the KVM guest across any POWER9
Hypervisor. The minimal addition that the Hypervisor would need is
Patch 1 of this series.
	   



--
Thanks and Regards
gautham.



More information about the Linuxppc-dev mailing list