[PATCH] KVM: PPC: Book3S PR: only call slbmte for valid SLB entries
Michael Ellerman
mpe at ellerman.id.au
Wed Sep 27 13:25:02 AEST 2017
Paul Mackerras <paulus at ozlabs.org> writes:
> On Tue, Sep 26, 2017 at 03:24:05PM +1000, Michael Ellerman wrote:
>> David Gibson <david at gibson.dropbear.id.au> writes:
>>
>> > On Fri, Sep 22, 2017 at 11:34:29AM +0200, Greg Kurz wrote:
>> >> Userland passes an array of 64 SLB descriptors to KVM_SET_SREGS,
>> >> some of which are valid (ie, SLB_ESID_V is set) and the rest are
>> >> likely all-zeroes (with QEMU at least).
>> >>
>> >> Each of them is then passed to kvmppc_mmu_book3s_64_slbmte(), which
>> >> expects to find the SLB index in the 3 lower bits of its rb argument.
>> >> When passed zeroed arguments, it happily overwrites the 0th SLB entry
>> >> with zeroes. This is exactly what happens while doing live migration
>> >> with QEMU when the destination pushes the incoming SLB descriptors to
>> >> KVM PR. When reloading the SLBs at the next synchronization, QEMU first
>> >> clears its SLB array and only restores the valid ones, but the 0th one
>> >> is now gone and we cannot access the corresponding memory anymore:
>> >>
>> >> (qemu) x/x $pc
>> >> c0000000000b742c: Cannot access memory
>> >>
>> >> To avoid this, let's filter out invalid SLB entries, as we
>> >> already do for Book3S HV.
>> >>
>> >> Signed-off-by: Greg Kurz <groug at kaod.org>
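
The fix being discussed is essentially the following (a minimal sketch,
modeled on what the Book3S HV set_sregs path already does; the field
names follow the KVM sregs layout, but this is illustrative rather than
the exact patch hunk):

	/*
	 * Only install descriptors that have SLB_ESID_V set, so the
	 * zeroed slots from userspace never reach slbmte().
	 */
	for (i = 0; i < ARRAY_SIZE(sregs->u.s.ppc64.slb); i++) {
		u64 rb = sregs->u.s.ppc64.slb[i].slbe;	/* ESID word */
		u64 rs = sregs->u.s.ppc64.slb[i].slbv;	/* VSID word */

		if (rb & SLB_ESID_V)
			vcpu->arch.mmu.slbmte(vcpu, rs, rb);
	}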
>> >
>> > This seems like a good idea, but to make it fully correct, don't we
>> > also need to fully flush the SLB before inserting the new entries?
>>
>> We would need to do that, yeah.
>>
>> But I don't think I like this patch: it would mean userspace has no way
>> of programming an invalid SLB entry. It's true that in general that
>> isn't something we care about doing, but the API should allow it.
>>
>> For example, the kernel could leave invalid entries in place and flip
>> the valid bit when it wanted to make them valid, and this patch would
>> prevent that state from being successfully migrated, IIUIC.
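
Spelled out, that hypothetical pattern would be something like this
(names illustrative only; no actual kernel code is known to do this):

	u64 parked = esid & ~SLB_ESID_V;	/* invalid, contents preserved */
	/* ... later, when the entry should take effect: */
	u64 live = parked | SLB_ESID_V;

Filtering on SLB_ESID_V at KVM_SET_SREGS time would drop the parked
entry on the destination.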
>
> If I remember correctly, the architecture says that slbmfee/slbmfev
> return all zeroes for an invalid entry, so there would be no way for
> the guest kernel to do what you suggest.
You're right, it does.
We have code in xmon that reads entries and then checks for SLB_ESID_V,
but I guess that's just overly pessimistic.
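
That xmon pattern looks roughly like this (simplified from
dump_segments() in arch/powerpc/xmon/xmon.c; details may differ):

	for (i = 0; i < mmu_slb_size; i++) {
		unsigned long esid, vsid;

		asm volatile("slbmfee %0,%1" : "=r" (esid) : "r" (i));
		asm volatile("slbmfev %0,%1" : "=r" (vsid) : "r" (i));

		if (!esid && !vsid)		/* invalid entries read back as zeroes */
			continue;
		if (!(esid & SLB_ESID_V))	/* the arguably redundant check */
			continue;

		printf("%02d %016lx %016lx\n", i, esid, vsid);
	}

If the architecture guarantees that invalid entries read back as all
zeroes, the SLB_ESID_V test after the zero check can never fire, hence
"overly pessimistic".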
cheers