[RFC PATCH 1/2] mm/pgtable: use ptdesc for pmd_huge_pte

Christophe Leroy (CS GROUP) chleroy at kernel.org
Mon Dec 15 17:06:34 AEDT 2025



On 14/12/2025 at 07:55, alexs at kernel.org wrote:
> From: Alex Shi <alexs at kernel.org>
> 
> The 'pmd_huge_pte' fields are pgtable variables, but the
> pgtable_trans_huge_deposit/withdraw functions use 'pgtable->lru'
> instead of 'pgtable->pt_list'. That's a bit weird.
> 
> So let's convert the pgtable_t to precise 'struct ptdesc *' for
> ptdesc->pmd_huge_pte, and mm->pmd_huge_pte, then convert function
> pgtable_trans_huge_deposit() to use correct ptdesc.
> 
> This conversion works for most archs, but fails on s390/sparc/powerpc
> since they use 'pte_t *' as pgtable_t. Are there any suggestions for these
> archs? If we could find a solution, we might be able to remove pgtable_t
> for the other archs.

The use of struct ptdesc * assumes that a pagetable is contained in one 
(or several) page(s).

On powerpc, there can be several page tables in one page. For instance,
on powerpc 8xx the hardware requires page tables to be 4k at all times,
although page sizes can be either 4k or 16k. So in the 16k case there
are 4 page tables in one page.
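
To make the arithmetic concrete, here is a minimal user-space sketch of
the 16k case. It is not kernel code: the SKETCH_* constants and the
sketch_* helpers are names made up for illustration, and the sizes are
simply the 8xx figures quoted above.

```c
#include <stdint.h>

/* Illustrative sizes, assumed from the 8xx 16k-page case described above. */
#define SKETCH_PAGE_SIZE       (16u * 1024u)  /* one 16k page */
#define SKETCH_TABLE_SIZE      (4u * 1024u)   /* hardware-mandated 4k table */
#define SKETCH_TABLES_PER_PAGE (SKETCH_PAGE_SIZE / SKETCH_TABLE_SIZE)

/* Base address of the page that contains a given page table. */
static uintptr_t sketch_page_base(uintptr_t table_addr)
{
	return table_addr & ~((uintptr_t)SKETCH_PAGE_SIZE - 1);
}

/* Position (0..3) of a page table inside its containing page. */
static unsigned int sketch_table_index(uintptr_t table_addr)
{
	uintptr_t offset = table_addr & ((uintptr_t)SKETCH_PAGE_SIZE - 1);

	return (unsigned int)(offset / SKETCH_TABLE_SIZE);
}
```

With these assumptions, four distinct table addresses share one page
base and differ only in their index, so anything keyed on the page alone
cannot tell them apart.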

There is some logic in arch/powerpc/mm/pgtable-frag.c to handle that, but
it only covers the last levels (PTs and PMDs). For the other levels,
kmem_cache_alloc() is used to provide a PxD of the right size. Maybe the
solution is to convert all levels to use pgtable-frag, but this
doesn't look trivial. It should probably be done at the core level, not at
the arch level.

Christophe

> 
> Signed-off-by: Alex Shi <alexs at kernel.org>
> ---
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index aac8ce30cd3b..f10736af296d 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -1320,11 +1320,11 @@ pud_t pudp_huge_get_and_clear_full(struct vm_area_struct *vma,
>   
>   #define __HAVE_ARCH_PGTABLE_DEPOSIT
>   static inline void pgtable_trans_huge_deposit(struct mm_struct *mm,
> -					      pmd_t *pmdp, pgtable_t pgtable)
> +					      pmd_t *pmdp, struct ptdesc *pgtable)
>   {
>   	if (radix_enabled())
> -		return radix__pgtable_trans_huge_deposit(mm, pmdp, pgtable);
> -	return hash__pgtable_trans_huge_deposit(mm, pmdp, pgtable);
> +		return radix__pgtable_trans_huge_deposit(mm, pmdp, page_ptdesc(pgtable));
> +	return hash__pgtable_trans_huge_deposit(mm, pmdp, page_ptdesc(pgtable));
>   }
>   

I can't understand this change.

pgtable is a pointer to a page table, and you want to replace it with
something that returns a pointer to a struct page. How can that work?
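
The concern can be illustrated with a small user-space model. This is
only a sketch under stated assumptions: 'struct model_desc' is a
hypothetical per-page descriptor, not the kernel's struct ptdesc, and
the model_* helpers do not correspond to any kernel function. It assumes
16k pages holding 4k page tables.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 14u  /* 16k pages, assumed for illustration */

/* Hypothetical per-page descriptor: it identifies a page, nothing finer. */
struct model_desc {
	uintptr_t pfn;
};

/* Going from a page table address to the descriptor drops the offset. */
static struct model_desc model_table_to_desc(uintptr_t table_addr)
{
	struct model_desc d = { table_addr >> MODEL_PAGE_SHIFT };

	return d;
}

/* Going back can only yield the start of the page, not the table itself. */
static uintptr_t model_desc_to_addr(struct model_desc d)
{
	return d.pfn << MODEL_PAGE_SHIFT;
}
```

In this model a table at offset 4k inside its page maps to the same
descriptor as the table at offset 0, and the round trip returns the page
start rather than the original table address.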

Christophe

