[PATCH kernel v2] KVM: PPC: Optimize clearing TCEs for sparse tables

Paul Mackerras paulus at ozlabs.org
Mon Oct 22 08:52:11 AEDT 2018


On Mon, Oct 15, 2018 at 09:08:41PM +1100, Alexey Kardashevskiy wrote:
> The powernv platform maintains 2 TCE tables for VFIO - a hardware TCE
> table and a table with userspace addresses. These tables are radix trees
> whose indirect levels are allocated when they are first written to. Since
> memory allocation is problematic in real mode, we have 2 accessors
> to the entries:
> - for virtual mode: it allocates the memory and it is always expected
> to return non-NULL;
> - for real mode: it does not allocate and can return NULL.
> 
> Also, DMA windows can span up to 55 bits of the address space and since
> we never have this much RAM, such windows are sparse. However, currently
> the SPAPR TCE IOMMU driver walks through all TCEs to unpin DMA memory.
> 
> Since we maintain a userspace addresses table for VFIO which is a mirror
> of the hardware table, we can use it to know which parts of the DMA
> window have not been mapped and skip them; this patch does exactly that.
> 
> The bare metal systems do not have this problem as they use a bypass mode
> of a PHB which maps RAM directly.
> 
> This helps a lot with sparse DMA windows, reducing the shutdown time from
> about 3 minutes per 1 billion TCEs to a few seconds for a 32GB sparse guest.
> Just skipping the last level seems to be good enough.
> 
> As the non-allocating accessor is now used in virtual mode as well, rename it
> from IOMMU_TABLE_USERSPACE_ENTRY_RM (real mode) to _RO (read only).
> 
> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
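
For readers without the patch at hand, the idea described in the commit
message can be illustrated roughly as follows. This is a standalone
userspace sketch, not the kernel code: struct sparse_table, entry_alloc(),
entry_ro(), clear_range() and the 512-entry leaf size are all hypothetical
names and values chosen for illustration. It assumes a two-level table whose
leaves are allocated on first write, and shows how a non-allocating
(read-only) lookup lets the clearing loop jump over a whole leaf that was
never written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define LEAF_BITS 9                     /* assumption: 512 entries per leaf */
#define LEAF_SIZE ((size_t)1 << LEAF_BITS)
#define LEAF_MASK (LEAF_SIZE - 1)

/* Hypothetical two-level table; a NULL leaf has never been written to. */
struct sparse_table {
    uint64_t **leaves;
    size_t nleaves;
};

/* Allocating accessor: creates the leaf on demand (the "virtual mode" role). */
static uint64_t *entry_alloc(struct sparse_table *t, size_t idx)
{
    size_t leaf = idx >> LEAF_BITS;

    if (leaf >= t->nleaves)
        return NULL;
    if (!t->leaves[leaf]) {
        t->leaves[leaf] = calloc(LEAF_SIZE, sizeof(uint64_t));
        if (!t->leaves[leaf])
            return NULL;
    }
    return &t->leaves[leaf][idx & LEAF_MASK];
}

/* Read-only accessor: never allocates, returns NULL for an absent leaf. */
static uint64_t *entry_ro(const struct sparse_table *t, size_t idx)
{
    size_t leaf = idx >> LEAF_BITS;

    if (leaf >= t->nleaves || !t->leaves[leaf])
        return NULL;
    return &t->leaves[leaf][idx & LEAF_MASK];
}

/*
 * Clear entries in [first, first + count) and return how many entries were
 * actually visited. When the read-only lookup finds no leaf, the loop skips
 * straight to the first index of the next leaf.
 */
static size_t clear_range(struct sparse_table *t, size_t first, size_t count)
{
    size_t i = first, end = first + count, visited = 0;

    assert(t != NULL);
    while (i < end) {
        uint64_t *e = entry_ro(t, i);

        if (!e) {
            i = (i | LEAF_MASK) + 1;    /* skip the unallocated leaf */
            continue;
        }
        *e = 0;                         /* a real driver would unpin here */
        visited++;
        i++;
    }
    return visited;
}
```

With only a couple of leaves populated, clear_range() touches just those
leaves' entries rather than every index in the window, which is the effect
the commit message attributes to skipping the last level.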

Thanks, applied to my kvm-ppc-next branch, and now in the kvm next
branch also.

Paul.

