[RFC PATCH 2/3] powerpc/mm/iommu: Allow large IOMMU page size only for hugetlb backing

David Gibson david at gibson.dropbear.id.au
Tue Sep 4 14:06:43 AEST 2018


On Mon, Sep 03, 2018 at 10:07:32PM +0530, Aneesh Kumar K.V wrote:
> THP pages can get split during different code paths. An incremented reference
> count does imply we will not split the compound page. But the pmd entry can be
> converted to level 4 pte entries. Keep the code simpler by allowing large
> IOMMU page size only if the guest RAM is backed by hugetlb pages.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>

So, I ok'ed this in an earlier discussion, but I've had another thought
and now I'm not so sure.


> ---
>  arch/powerpc/mm/mmu_context_iommu.c | 16 ++--------------
>  1 file changed, 2 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
> index c9ee9e23845f..f472965f7638 100644
> --- a/arch/powerpc/mm/mmu_context_iommu.c
> +++ b/arch/powerpc/mm/mmu_context_iommu.c
> @@ -212,21 +212,9 @@ long mm_iommu_get(struct mm_struct *mm, unsigned long ua, unsigned long entries,
>  		}
>  populate:
>  		pageshift = PAGE_SHIFT;
> -		if (mem->pageshift > PAGE_SHIFT && PageCompound(page)) {
> -			pte_t *pte;
> +		if (mem->pageshift > PAGE_SHIFT && PageHuge(page)) {

We can only support large IOMMU pages with static hugetlb pages, not
THPs, so the change from PageCompound() to PageHuge() is definitely
correct and a good idea.

>  			struct page *head = compound_head(page);
> -			unsigned int compshift = compound_order(head);
> -			unsigned int pteshift;
> -
> -			local_irq_save(flags); /* disables as well */
> -			pte = find_linux_pte(mm->pgd, cur_ua, NULL, &pteshift);
> -
> -			/* Double check it is still the same pinned page */
> -			if (pte && pte_page(*pte) == head &&
> -			    pteshift == compshift + PAGE_SHIFT)
> -				pageshift = max_t(unsigned int, pteshift,
> -						PAGE_SHIFT);
> -			local_irq_restore(flags);
> +			pageshift = compound_order(head) + PAGE_SHIFT;

But my concern with this part is: are we totally certain there's no
way to get part of a hugetlbfs page mapped with regular-sized PTEs
(probably in addition to the expected hugetlb mapping)?

I'm thinking of weirdness like mremap(), mapping another hugetlb-using
process's address space via /proc/*/mem, or maybe something even more
exotic.

Now, it's possible that we don't really care here: even if it's not
technically right for this mapping, we could argue that as long as the
process has access to part of the hugepage, the whole thing is fair
game for a DMA mapping.  In that case, merely double-checking that the
mapping is properly aligned would suffice, i.e. that:
    (ua >> PAGE_SHIFT) == (page's index within the compound page)
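To make the suggested check concrete, here is a rough userspace model
of the alignment test (names, the pfn-based interface, and the 4K
PAGE_SHIFT are my own assumptions for illustration, not the kernel
API — in mm_iommu_get() you'd derive the index from compound_head()
and page_to_pfn() instead):

```c
/* Sketch: the small-page index of ua within a huge-page-aligned
 * window must equal the pinned page's index within its compound
 * page, otherwise ua is not huge-page aligned with the backing
 * hugepage and we must fall back to PAGE_SHIFT. */

#include <stdbool.h>

#define PAGE_SHIFT 12UL	/* assumes 4K base pages for this sketch */

static bool iommu_mapping_aligned(unsigned long ua,
				  unsigned long page_pfn,
				  unsigned long head_pfn,
				  unsigned int compound_order)
{
	/* mask selecting the index within the compound page */
	unsigned long idx_mask = (1UL << compound_order) - 1;
	/* index of the pinned small page within the compound page */
	unsigned long page_idx = page_pfn - head_pfn;
	/* index of ua's small page within a huge-page-sized window */
	unsigned long ua_idx = (ua >> PAGE_SHIFT) & idx_mask;

	return ua_idx == page_idx;
}
```

With a 16M hugepage (compound order 12 at 4K base pages), a ua whose
low page index matches the page's offset from the compound head passes;
anything mapped at a shifted offset (the mremap()-style case above)
fails and would keep pageshift at PAGE_SHIFT.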

>  		}
>  		mem->pageshift = min(mem->pageshift, pageshift);
>  		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson