[RFC 1/2] mm: thp: allocate PTE page tables lazily at split time

David Hildenbrand (Arm) david at kernel.org
Fri Feb 13 02:39:41 AEDT 2026


>>
>> Is there a way to remove this? It's always been a confusing hack, now
>> it's unpleasant to have around :)
>>
> 
> Hash MMU on PowerPC works fundamentally differently from other MMUs
> (unlike Radix MMU on PowerPC). So yes, it requires a few tricks to fit
> into Linux's multi-level SW page table model. ;)

:)

> 
>> In particular, seeing that radix__pgtable_trans_huge_deposit() just 1:1
>> copied generic pgtable_trans_huge_deposit() hurts my belly.
>>
> 
> On PowerPC, pgtable_t can be a pte fragment.
> 
> typedef pte_t *pgtable_t;
> 
> That means a single page can be shared among other PTE page tables. So, we
> cannot use page->lru which the generic implementation uses. I guess due
> to this, there is a slight change in implementation of
> radix__pgtable_trans_huge_deposit().

Ah, did not spot this difference, but makes sense. Still ugly, but makes
sense. Fortunately it would go away with this RFC.
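
To make the constraint above concrete, here is a minimal userspace sketch,
not kernel code. All names (page_model, frag, FRAGS_PER_PAGE, the helpers)
are hypothetical. It only models why a single list node kept in per-page
metadata cannot track several deposited fragments of the same page, while
a node kept in each fragment's own (otherwise unused) memory can. Storing
the node inside the fragment is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define FRAGS_PER_PAGE 4   /* hypothetical: fragments sharing one page */
#define FRAG_WORDS     64  /* hypothetical: fragment size in words */

/* Simple intrusive singly linked node. */
struct node {
	struct node *next;
};

/* Hypothetical stand-in for per-page metadata: ONE node per page. */
struct page_meta {
	struct node lru;
};

/* One PTE fragment; while deposited its memory is unused and can hold a node. */
union frag {
	struct node link;
	unsigned long ptes[FRAG_WORDS];
};

struct page_model {
	struct page_meta meta;
	union frag frags[FRAGS_PER_PAGE];
};

/* Push a node onto the front of a deposit list. */
static void deposit(struct node **head, struct node *n)
{
	assert(n != NULL);
	n->next = *head;
	*head = n;
}

/* Count entries, stopping after 'limit' steps so a cycle cannot hang us. */
static size_t list_len(const struct node *head, size_t limit)
{
	size_t n = 0;

	while (head && n < limit) {
		n++;
		head = head->next;
	}
	return n;
}

/* Two fragments of one page, both tracked through the single per-page node. */
static size_t deposit_two_via_page_node(struct page_model *pg)
{
	struct node *head = NULL;

	deposit(&head, &pg->meta.lru);  /* fragment 0 */
	deposit(&head, &pg->meta.lru);  /* fragment 1: same node, now points at itself */
	return list_len(head, 8);       /* walk hits the limit: the list is a cycle */
}

/* Two fragments of one page, each tracked through its own embedded node. */
static size_t deposit_two_via_fragment_node(struct page_model *pg)
{
	struct node *head = NULL;

	deposit(&head, &pg->frags[0].link);
	deposit(&head, &pg->frags[1].link);
	return list_len(head, 8);       /* two distinct entries */
}
```

With a zero-initialized page_model, the per-fragment variant yields a list
of length 2, while the per-page variant ends up as a self-loop and the
bounded walk runs into its limit of 8.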

> 
> Doing a grep search, I think that's the same for sparc and s390 as well.

... and I also did not realize that s390x+sparc have separate 
implementations we can now get rid of as well.

> 
>>
>> IIUC, hash is mostly used on legacy power systems, radix on newer ones.
>>
>> So one obvious solution: remove PMD THP support for hash MMUs along with
>> all this hacky deposit code.
>>
> 
> Unfortunately, please no. There are real customers using Hash MMU on
> Power9 and even on older generations and this would mean breaking Hash
> PMD THP support for them.
> 

I was expecting this answer :)

> 
>>
>> the "vma_is_anonymous(vma) && !arch_needs_pgtable_deposit()" and similar
>> checks need to be wrapped in a reasonable helper and likely this all
>> needs to get cleaned up further.
>>
>> The implementation of the generic pgtable_trans_huge_deposit and the
>> radix handlers etc must be removed. If any code were to trigger them,
>> it would be a bug.
>>
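
A rough sketch of what wrapping the check quoted above in a helper could
look like, as a standalone userspace model rather than kernel code. The
struct, the *_model functions, and the helper name anon_without_deposit()
are all hypothetical. What the series actually does with the result is
not shown in this mail, so the name is purely descriptive.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct vm_area_struct. */
struct vma_model {
	const void *vm_ops;  /* NULL models an anonymous mapping */
};

/* Models vma_is_anonymous(): anonymous VMAs have no vm_ops. */
static bool vma_is_anonymous_model(const struct vma_model *vma)
{
	return vma->vm_ops == NULL;
}

/*
 * Models arch_needs_pgtable_deposit(). A plain variable here, so that
 * both outcomes can be exercised in one program.
 */
static bool arch_needs_deposit_model;

/* Hypothetical helper: the one place that encodes the combined check. */
static bool anon_without_deposit(const struct vma_model *vma)
{
	return vma_is_anonymous_model(vma) && !arch_needs_deposit_model;
}
```

Callers would then test anon_without_deposit(vma) instead of open-coding
the two-part condition, so a later change to the rule touches one place.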
> 
> Sure, I think after this patch series, the radix__pgtable_trans_huge_deposit()
> will mostly be dead code anyway. I will spend some time going
> through this series and will also give it a test on powerpc HW (with
> both Hash and Radix MMU).

Thanks! The series will grow quite a bit I think, so retesting new
revisions will be much appreciated!

> 
> I guess, we should also look at removing pgtable_trans_huge_deposit() and
> pgtable_trans_huge_withdraw() implementations from s390 and sparc, since
> those too will be dead code after this.

Exactly.


-- 
Cheers,

David


More information about the Linuxppc-dev mailing list