[PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
David Hildenbrand
david at redhat.com
Fri Sep 12 18:04:12 AEST 2025
>>
>> struct lazy_mmu_state {
>> uint8_t enabled_count;
>> bool paused;
>
> Looking at the arm64 implementation, I'm thinking: instead of the paused
> member, how about a PF_LAZY_MMU task flag? It would be set when lazy_mmu
> is actually enabled (i.e. inside an enter()/leave() section, and not
> inside a pause()/resume() section). This way, architectures could use
> that flag directly to tell if lazy_mmu is enabled instead of reinventing
> the wheel, all in slightly different ways. Namely:
>
> * arm64 uses a thread flag (TIF_LAZY_MMU) - this is trivially replaced
> with PF_LAZY_MMU
> * powerpc and sparc use batch->active where batch is a per-CPU variable;
> I expect this can also be replaced with PF_LAZY_MMU
> * x86/xen is more complex as it has xen_lazy_mode which tracks both
> LAZY_MMU and LAZY_CPU modes. I'd probably leave that one alone, unless a
> Xen expert is motivated to refactor it.
>
> With that approach, the implementation of arch_enter() and arch_leave()
> becomes very simple (no tracking of lazy_mmu status) on arm64, powerpc
> and sparc.
>
> (Of course we could also have an "enabled" member in lazy_mmu_state
> instead of PF_LAZY_MMU, there is no functional difference.)
>
No strong opinion, but to me it feels like PF_LAZY_MMU is rather "the
effective state when combining nested+paused", and might complicate the
code + sanity checks?
Of course, we could fairly easily maintain that in addition, from the
core, instead of letting archs do that manually.
I would probably have to see the end result to judge whether removing
the "paused" bool makes things look more complicated or not.
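To make the "effective state" point concrete, here is a minimal userspace
sketch, not kernel code. It assumes the two members quoted above;
lazy_mmu_effective() is a hypothetical helper name. It only shows what a
PF_LAZY_MMU flag (or an "enabled" member) would end up caching:

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of the proposed per-task state; not the actual patch. */
struct lazy_mmu_state {
	uint8_t enabled_count;	/* nesting depth of enter()/leave() */
	bool paused;		/* inside a pause()/resume() section */
};

/*
 * Hypothetical helper: lazy_mmu is effectively enabled when we are inside
 * at least one enter()/leave() section and not inside a paused section.
 */
static inline bool lazy_mmu_effective(const struct lazy_mmu_state *s)
{
	return s->enabled_count > 0 && !s->paused;
}
```

If the core kept a cached flag, it would have to be updated whenever either
member changes, which is the extra bookkeeping I am unsure about.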
>> }
>>
>> c) With that config, common-code lazy_mmu_*() functions implement the
>> updating of the lazy_mmu_state in task_struct and call into arch code
>> on the transition from 0->1, 1->0 etc.
>
> Indeed, this is how I thought about it. There is actually quite a lot
> that can be moved to the generic functions:
> * Updating lazy_mmu_state
> * Sanity checks on lazy_mmu_state (e.g. underflow/overflow)
> * Bailing out if in_interrupt() (not done consistently across arch's at
> the moment)
>
>>
>> Maybe that can be done through the existing
>> arch_enter_lazy_mmu_mode()/arch_leave_lazy_mmu_mode() callbacks, maybe
>> we need more. I feel like we might be able to implement that through
>> the existing helpers.
>
> We might want to rename them to align with the new generic helpers, but
> yes otherwise the principle should remain unchanged.
>
> In fact, we will also need to revive arch_flush_lazy_mmu_mode().
That's okay if it's all hidden behind a sane core API.
> Indeed,
> in the nested situation, we need the following arch calls:
>
> enter() -> arch_enter()
> enter() -> [nothing]
> leave() -> arch_flush()
> leave() -> arch_leave()
>
> leave() must always flush whatever arch state was batched, as may be
> expected by the caller.
>
> How does all that sound?
I am no expert on the "always flush when leaving", but it sounds
reasonable to me.
Which arch operations would you call from
pause()
continue()
?
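For reference, the nested enter()/leave() sequence quoted above could be
modelled roughly as follows. This is a userspace sketch under assumptions,
not the proposed kernel code: the arch_*() functions are stand-in counters,
the state is a plain global rather than a task_struct member, and the
outermost arch_leave() is assumed to flush as part of leaving. pause() and
continue() are deliberately left out, since that is the open question.

```c
#include <assert.h>
#include <stdint.h>

/* Counters standing in for the arch callbacks, to make the sequence visible. */
static int n_arch_enter, n_arch_flush, n_arch_leave;

static void arch_enter(void) { n_arch_enter++; }
static void arch_flush(void) { n_arch_flush++; }
static void arch_leave(void) { n_arch_leave++; }	/* assumed to flush too */

/* Would live in the task's lazy_mmu_state; a global here for simplicity. */
static uint8_t enabled_count;

static void lazy_mmu_enter(void)
{
	assert(enabled_count < UINT8_MAX);	/* overflow sanity check */
	if (enabled_count++ == 0)
		arch_enter();			/* 0 -> 1: outermost enter */
	/* nested enter: no arch call */
}

static void lazy_mmu_leave(void)
{
	assert(enabled_count > 0);		/* underflow sanity check */
	if (--enabled_count == 0)
		arch_leave();			/* 1 -> 0: outermost leave */
	else
		arch_flush();			/* nested leave: flush only */
}
```

With two nested sections this gives arch_enter(), nothing, arch_flush(),
arch_leave(), matching the four-line sequence in the quoted mail.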
>> And on top of the proposal above we will have some
>>
>> struct arch_lazy_mmu_state;
>>
>> defined by the architecture (could be an empty struct on most).
>>
>> We can store that inside "struct lazy_mmu_state;" or if we ever have
>> to, start returning only that from the enable/disable etc. functions.
>
> I'm not sure we'd want to mix those styles (task_struct member + local
> variable), that's adding complexity without much upside... Also having a
> local variable at every nesting level only makes sense if we have an
> arch callback regardless of nesting level, which is unnecessary in this
> proposed API.
Yes, that was rather an "if we ever really run out of space we could look
into that" idea; I am obviously not a fan of it.
--
Cheers
David / dhildenb