[RFC PATCH 1/2] mm/pgtable: use ptdesc for pmd_huge_pte

Alex Shi seakeel at gmail.com
Tue Dec 16 01:26:42 AEDT 2025



On 2025/12/15 14:06, Christophe Leroy (CS GROUP) wrote:
> 
> On 14/12/2025 at 07:55, alexs at kernel.org wrote:
>> From: Alex Shi <alexs at kernel.org>
>>
>> 'pmd_huge_pte' is a pgtable variable, but the
>> pgtable_trans_huge_deposit/withdraw functions use 'pgtable->lru'
>> instead of pgtable->pt_list for it, which is a bit weird.
>>
>> So let's convert pgtable_t to the precise 'struct ptdesc *' for
>> ptdesc->pmd_huge_pte and mm->pmd_huge_pte, then convert
>> pgtable_trans_huge_deposit() to use the correct ptdesc.
>>
>> This conversion works for most architectures, but fails on
>> s390/sparc/powerpc since they use 'pte_t *' as pgtable_t. Are there any
>> suggestions for these architectures? If we could find a solution, we
>> could remove pgtable_t for the other architectures.
> 
> The use of struct ptdesc * assumes that a pagetable is contained in one 
> (or several) page(s).
> 
> On powerpc, there can be several page tables in one page. For instance, 
> on powerpc 8xx the hardware requires page tables to be 4k at all times, 
> although page sizes can be either 4k or 16k. So in the 16k case there 
> are 4 page tables in one page.

Hi Christophe,

Thanks a lot for the info.
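
To check my understanding of the 8xx case, here is a minimal user-space
sketch of the arithmetic. Only the 16k and 4k sizes come from your
description; every SKETCH_* / sketch_* name is made up for illustration
and is not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes for the 16k-page case (assumption: taken from the mail). */
#define SKETCH_PAGE_SIZE   (16u * 1024u)  /* base page size */
#define SKETCH_TABLE_SIZE  (4u * 1024u)   /* page table size fixed by hardware */

/* Number of page tables that fit in one page. */
static inline unsigned int sketch_tables_per_page(void)
{
	return SKETCH_PAGE_SIZE / SKETCH_TABLE_SIZE;
}

/* Address of the idx-th table inside the page starting at page_base. */
static inline uintptr_t sketch_table_addr(uintptr_t page_base, unsigned int idx)
{
	assert(idx < sketch_tables_per_page());
	return page_base + (uintptr_t)idx * SKETCH_TABLE_SIZE;
}

/* Page that a table address falls in, i.e. all a per-page descriptor can see. */
static inline uintptr_t sketch_page_of(uintptr_t table_addr)
{
	return table_addr & ~((uintptr_t)SKETCH_PAGE_SIZE - 1);
}
```

With these sizes sketch_tables_per_page() is 4, and all four table
addresses in a page map back to the same sketch_page_of() value, so a
per-page descriptor alone cannot tell the tables apart.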

> 
> There is some logic in arch/powerpc/mm/pgtable-frag.c to handle that but 
> this is only for last levels (PTs and PMDs). For other levels 
> kmem_cache_alloc() is used to provide a PxD of the right size. Maybe the 
> solution is to convert all levels to using pgtable-frag, but this 
> doesn't look trivial. Probably it should be done at core level not at 
> arch level.

Glad to hear there is an idea for this. Could you explain it in more 
detail?

Thanks a lot

> 
> Christophe
> 
>>
>> Signed-off-by: Alex Shi <alexs at kernel.org>
>> ---
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index aac8ce30cd3b..f10736af296d 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -1320,11 +1320,11 @@ pud_t pudp_huge_get_and_clear_full(struct vm_area_struct *vma,
>>   #define __HAVE_ARCH_PGTABLE_DEPOSIT
>>   static inline void pgtable_trans_huge_deposit(struct mm_struct *mm,
>> -                          pmd_t *pmdp, pgtable_t pgtable)
>> +                          pmd_t *pmdp, struct ptdesc *pgtable)
>>   {
>>       if (radix_enabled())
>> -        return radix__pgtable_trans_huge_deposit(mm, pmdp, pgtable);
>> -    return hash__pgtable_trans_huge_deposit(mm, pmdp, pgtable);
>> +        return radix__pgtable_trans_huge_deposit(mm, pmdp, page_ptdesc(pgtable));
>> +    return hash__pgtable_trans_huge_deposit(mm, pmdp, page_ptdesc(pgtable));
>>   }
> 
> I can't understand this change.
> 
> pgtable is a pointer to a page table, and you want to replace it with 
> something that returns a pointer to a struct page. How can that work?

Sorry for the confusion. Right, it can't work, as I mentioned in the 
commit log. I just wanted to raise this issue, and hope you experts can 
offer some ideas.
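
For the record, the mismatch can be sketched in plain user-space C. This
is only an illustration, assuming 4k tables packed into 16k pages; the
mock_* types and helpers are hypothetical and are not the real pgtable_t,
struct ptdesc or page_ptdesc() definitions.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_PAGE_SIZE   (16u * 1024u)  /* assumed base page size */
#define MOCK_TABLE_SIZE  (4u * 1024u)   /* assumed page table size */

typedef struct { unsigned long val; } mock_pte_t;
typedef mock_pte_t *mock_pgtable_t;     /* handle = pointer to the table itself */

struct mock_ptdesc {                    /* one descriptor per page */
	uintptr_t page_base;
};

/* Table pointer -> per-page descriptor: the offset within the page is dropped. */
static inline struct mock_ptdesc mock_table_to_ptdesc(mock_pgtable_t table)
{
	struct mock_ptdesc d = {
		(uintptr_t)table & ~((uintptr_t)MOCK_PAGE_SIZE - 1)
	};
	return d;
}

/* Descriptor -> table pointer: only the page base can be recovered. */
static inline mock_pgtable_t mock_ptdesc_to_table(struct mock_ptdesc d)
{
	return (mock_pgtable_t)d.page_base;
}

/* Returns 1 if the handle survives a round trip through the descriptor. */
static inline int mock_round_trip_ok(mock_pgtable_t table)
{
	assert(((uintptr_t)table % MOCK_TABLE_SIZE) == 0);
	return mock_ptdesc_to_table(mock_table_to_ptdesc(table)) == table;
}
```

In this model only the first table in a page survives the round trip;
the other three come back as the page base, which is why a plain
descriptor pointer cannot stand in for a 'pte_t *' handle here.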

Thanks

