[PATCH v3 23/33] KVM: PPC: Book3S HV: Introduce rmap to track nested guest mappings

David Gibson david at gibson.dropbear.id.au
Wed Oct 3 15:56:37 AEST 2018


On Tue, Oct 02, 2018 at 09:31:22PM +1000, Paul Mackerras wrote:
> From: Suraj Jitindar Singh <sjitindarsingh at gmail.com>
> 
> When a host (L0) page which is mapped into a (L1) guest is in turn
> mapped through to a nested (L2) guest we keep a reverse mapping (rmap)
> so that these mappings can be retrieved later.
> 
> Whenever we create an entry in a shadow_pgtable for a nested guest we
> create a corresponding rmap entry and add it to the list for the
> L1 guest memslot at the index of the L1 guest page it maps. This means
> at the L1 guest memslot we end up with lists of rmaps.
> 
> When we are notified of a host page being invalidated which has been
> mapped through to a (L1) guest, we can then walk the rmap list for that
> guest page, and find and invalidate all of the corresponding
> shadow_pgtable entries.
> 
> In order to reduce memory consumption, we compress the information for
> each rmap entry down to 52 bits -- 12 bits for the LPID and 40 bits
> for the guest real page frame number -- which will fit in a single
> unsigned long.  To avoid a scenario where a guest can trigger
> unbounded memory allocations, we scan the list when adding an entry to
> see if there is already an entry with the contents we need.  This can
> occur, because we don't ever remove entries from the middle of a list.
> 
> A struct nested guest rmap is a list pointer and an rmap entry;
> ----------------
> | next pointer |
> ----------------
> | rmap entry   |
> ----------------
> 
> Thus the rmap pointer for each guest frame number in the memslot can be
> either NULL, a single entry, or a pointer to a list of nested rmap entries.
> 
> gfn	 memslot rmap array
>  	-------------------------
>  0	| NULL			|	(no rmap entry)
>  	-------------------------
>  1	| single rmap entry	|	(rmap entry with low bit set)
>  	-------------------------
>  2	| list head pointer	|	(list of rmap entries)
>  	-------------------------
> 
> The final entry always has the lowest bit set and is stored in the next
> pointer of the last list entry, or as a single rmap entry.
> With a list of rmap entries looking like;
> 
> -----------------	-----------------	-------------------------
> | list head ptr	| ----> | next pointer	| ---->	| single rmap entry	|
> -----------------	-----------------	-------------------------
> 			| rmap entry	|	| rmap entry		|
> 			-----------------	-------------------------
> 
> Signed-off-by: Suraj Jitindar Singh <sjitindarsingh at gmail.com>
> Signed-off-by: Paul Mackerras <paulus at ozlabs.org>
> ---
>  arch/powerpc/include/asm/kvm_book3s.h    |   3 +
>  arch/powerpc/include/asm/kvm_book3s_64.h |  70 ++++++++++++++++-
>  arch/powerpc/kvm/book3s_64_mmu_radix.c   |  44 +++++++----
>  arch/powerpc/kvm/book3s_hv.c             |   1 +
>  arch/powerpc/kvm/book3s_hv_nested.c      | 130 ++++++++++++++++++++++++++++++-
>  5 files changed, 233 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
> index d983778..1d2286d 100644
> --- a/arch/powerpc/include/asm/kvm_book3s.h
> +++ b/arch/powerpc/include/asm/kvm_book3s.h
> @@ -196,6 +196,9 @@ extern int kvmppc_mmu_radix_translate_table(struct kvm_vcpu *vcpu, gva_t eaddr,
>  			int table_index, u64 *pte_ret_p);
>  extern int kvmppc_mmu_radix_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
>  			struct kvmppc_pte *gpte, bool data, bool iswrite);
> +extern void kvmppc_unmap_pte(struct kvm *kvm, pte_t *pte, unsigned long gpa,
> +			unsigned int shift, struct kvm_memory_slot *memslot,
> +			unsigned int lpid);
>  extern bool kvmppc_hv_handle_set_rc(struct kvm *kvm, pgd_t *pgtable,
>  				    bool writing, unsigned long gpa,
>  				    unsigned int lpid);
> diff --git a/arch/powerpc/include/asm/kvm_book3s_64.h b/arch/powerpc/include/asm/kvm_book3s_64.h
> index 5496152..38614f0 100644
> --- a/arch/powerpc/include/asm/kvm_book3s_64.h
> +++ b/arch/powerpc/include/asm/kvm_book3s_64.h
> @@ -53,6 +53,66 @@ struct kvm_nested_guest {
>  	struct kvm_nested_guest *next;
>  };
>  
> +/*
> + * We define a nested rmap entry as a single 64-bit quantity
> + * 0xFFF0000000000000	12-bit lpid field
> + * 0x000FFFFFFFFFF000	40-bit guest physical address field

I thought we could potentially support guests with >1TiB of RAM..?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20181003/3704a810/attachment-0001.sig>


More information about the Linuxppc-dev mailing list