[PATCH] KVM: PPC: Book3S PR: only call slbmte for valid SLB entries
Paul Mackerras
paulus at ozlabs.org
Tue Sep 26 21:34:38 AEST 2017
On Tue, Sep 26, 2017 at 03:24:05PM +1000, Michael Ellerman wrote:
> David Gibson <david at gibson.dropbear.id.au> writes:
>
> > On Fri, Sep 22, 2017 at 11:34:29AM +0200, Greg Kurz wrote:
> >> Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS,
> >> some of which are valid (ie, SLB_ESID_V is set) and the rest are
> >> likely all-zeroes (with QEMU at least).
> >>
> >> Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which
> >> assumes to find the SLB index in the 3 lower bits of its rb argument.
> >> When passed zeroed arguments, it happily overwrites the 0th SLB entry
> >> with zeroes. This is exactly what happens while doing live migration
> >> with QEMU when the destination pushes the incoming SLB descriptors to
> >> KVM PR. When reloading the SLBs at the next synchronization, QEMU first
> >> clears its SLB array and only restore valid ones, but the 0th one is
> >> now gone and we cannot access the corresponding memory anymore:
> >>
> >> (qemu) x/x $pc
> >> c0000000000b742c: Cannot access memory
> >>
> >> To avoid this, let's filter out non-valid SLB entries, like we
> >> already do for Book3S HV.
> >>
> >> Signed-off-by: Greg Kurz <groug at kaod.org>
> >
> > This seems like a good idea, but to make it fully correct, don't we
> > also need to fully flush the SLB before inserting the new entries.
>
> We would need to do that yeah.
>
> But I don't think I like this patch, it would mean userspace has no way
> of programming an invalid SLB entry. It's true that in general that
> isn't something we care about doing, but the API should allow it.
>
> For example the kernel could leave invalid entries in place and flip the
> valid bit when it wanted to make them valid, and this patch would
> prevent that state being successfully migrated IIUIC.
If I remember correctly, the architecture says that slbmfee/slbmfev
return all zeroes for an invalid entry, so there would be no way for
the guest kernel to do what you suggest.
Paul.
More information about the Linuxppc-dev
mailing list