[PATCH 3/3] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to CPUs in kernel mode

David Hildenbrand david at redhat.com
Fri Apr 7 01:51:52 AEST 2023

On 06.04.23 17:02, Peter Zijlstra wrote:
> On Thu, Apr 06, 2023 at 04:04:23PM +0200, Peter Zijlstra wrote:
>> On Thu, Apr 06, 2023 at 03:29:28PM +0200, Peter Zijlstra wrote:
>>> On Thu, Apr 06, 2023 at 09:38:50AM -0300, Marcelo Tosatti wrote:
>>>>> To actually hit this path you're doing something really dodgy.
>>>> Apparently khugepaged is using the same infrastructure:
>>>> $ grep tlb_remove_table khugepaged.c
>>>> 	tlb_remove_table_sync_one();
>>>> 	tlb_remove_table_sync_one();
>>>> So just enabling khugepaged will hit that path.
>>> Urgh, WTF..
>>> Let me go read that stuff :/
>> At the very least the one on collapse_and_free_pmd() could easily become
>> a call_rcu() based free.
>> I'm not sure I'm following what collapse_huge_page() does just yet.
> DavidH, what do you think about reviving Jann's patches here:
>    https://bugs.chromium.org/p/project-zero/issues/detail?id=2365#c1
> Those are far more invasive, but afaict they seem to do the right thing.

I recall seeing those while they were discussed on security at
kernel.org. What we currently have was (IMHO for good reasons) deemed
the better way to fix the issue, especially when caring about backports
and getting it right.

The alternative that was discussed in that context IIRC was to simply 
allocate a fresh page table, place the fresh page table into the list 
instead, and simply free the old page table (then using common machinery).

TBH, I wish (and recently raised) that we could just stop wasting
memory on page tables for THPs that are maybe never going to get
PTE-mapped ... and eventually just allocate on demand (with some
caching?) and handle the places where we're OOM and cannot PTE-map a THP
in some decent way.

... instead of trying to figure out how to deal with these page tables 
we cannot free but have to special-case simply because of GUP-fast.


David / dhildenb

More information about the Linuxppc-dev mailing list