[PATCH 00/19] KVM: PPC: Book3S HV: add XIVE native exploitation mode
Paul Mackerras
paulus at ozlabs.org
Tue Jan 22 15:46:54 AEDT 2019
On Mon, Jan 07, 2019 at 07:43:12PM +0100, Cédric Le Goater wrote:
> Hello,
>
> On the POWER9 processor, the XIVE interrupt controller can control
> interrupt sources using MMIO to trigger events, to EOI or to turn off
> the sources. Priority management and interrupt acknowledgment are also
> controlled by MMIO in the CPU presenter subengine.
>
> PowerNV/baremetal Linux runs natively under XIVE but sPAPR guests need
> special support from the hypervisor to do the same. This is called the
> XIVE native exploitation mode and today, it can be activated under the
> PowerPC Hypervisor, pHyp. However, Linux/KVM lacks XIVE native support
> and still offers the old interrupt mode interface using a
> XICS-over-XIVE glue which implements the XICS hcalls.
>
> The following series is a proposal to add the same support under KVM.
>
> A new KVM device is introduced for the XIVE native exploitation
> mode. It reuses most of the XICS-over-XIVE glue implementation
> structures which are internal to KVM but has a completely different
> interface. A set of Hypervisor calls configures the sources and the
> event queues and from there, all control is done by the guest through
> MMIOs.
>
> These MMIO regions (ESB and TIMA) are exposed to guests in QEMU,
> similarly to VFIO, and the associated VMAs are populated dynamically
> with the appropriate pages using a fault handler. This is implemented
> with a couple of KVM device ioctls.
>
> On a POWER9 sPAPR machine, the Client Architecture Support (CAS)
> negotiation process determines whether the guest operates with an
> interrupt controller using the XICS legacy model, as found on POWER8,
> or in XIVE exploitation mode. This means that the KVM interrupt
> device should be created at runtime, after the machine has started.
> This requires extra KVM support to create/destroy KVM devices. The
> last patches are an attempt to solve that problem.
>
> Migration has its own specific needs. The patchset provides the
> necessary routines to quiesce XIVE, to capture and restore the state
> of the different structures used by KVM, OPAL and HW. Extra OPAL
> support is required for these.

Thanks for the patchset. It mostly looks good, but there are some
more things we need to consider, and I think a v2 will be needed.

One general comment I have is that there are a lot of acronyms in this
code and you mostly seem to assume that people will know what they all
mean. It would make the code more readable if you provide the
expansion of the acronym on first use in a comment or whatever. For
example, one of the patches in this series talks about the "EAS"
without ever expanding it in any comment or in the patch description,
and I have forgotten just at the moment what EAS stands for (I just
know that understanding the XIVE is not eas-y. :)

Another general comment is that you seem to have written all this
code assuming we are using HV KVM in a host running bare-metal.
However, we could be using PR KVM (either in a bare-metal host or in a
guest), or we could be doing nested HV KVM where we are using the
kvm_hv module inside a KVM guest and using special hypercalls for
controlling our guests.

It would be perfectly acceptable for now to say that we don't yet
support XIVE exploitation in those scenarios, as long as we then make
sure that the new KVM capability reports false in those scenarios, and
any attempt to use the XIVE exploitation interfaces fails cleanly.
I don't see that either of those is true in the patch set as it
stands, so that is one area that needs to be fixed.
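
A minimal standalone sketch of the gating described above might look
like the following. It is not kernel code: struct host_env,
xive_native_supported(), check_xive_native_cap() and
create_xive_native_device() are hypothetical names made up for
illustration, and the choice of -ENODEV as the error is an assumption.
The only point is that the capability query and the device-create path
share one predicate, so they cannot disagree.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of how KVM is running; not a real kernel structure. */
enum kvm_flavour { KVM_FLAVOUR_HV, KVM_FLAVOUR_PR };

struct host_env {
	enum kvm_flavour flavour;	/* HV KVM or PR KVM */
	bool hypervisor_mode;		/* true on bare metal, false inside a guest */
	bool xive_enabled;		/* the host itself is using XIVE */
};

/* Single predicate: only bare-metal HV KVM on a XIVE host qualifies. */
bool xive_native_supported(const struct host_env *env)
{
	return env->flavour == KVM_FLAVOUR_HV &&
	       env->hypervisor_mode &&
	       env->xive_enabled;
}

/* Capability query: reports 0 (false) in PR and nested HV scenarios. */
int check_xive_native_cap(const struct host_env *env)
{
	return xive_native_supported(env) ? 1 : 0;
}

/*
 * Device creation: fails cleanly instead of proceeding when unsupported.
 * -ENODEV is an assumed error code, chosen only for the sketch.
 */
int create_xive_native_device(const struct host_env *env)
{
	if (!xive_native_supported(env))
		return -ENODEV;
	return 0;	/* real device setup would happen here */
}
```

With this shape, PR KVM on bare metal and nested HV KVM both see the
capability as false, and creating the device in either case returns an
error rather than touching any XIVE state.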
A third general comment is that the new KVM interfaces you have added
need to be documented in the files under Documentation/virtual/kvm.

Paul.