[PATCH v3 2/2] cxl: Enable global TLBIs for cxl contexts

Nicholas Piggin nicholas.piggin at gmail.com
Fri Sep 8 16:56:24 AEST 2017


On Sun,  3 Sep 2017 20:15:13 +0200
Frederic Barrat <fbarrat at linux.vnet.ibm.com> wrote:

> The PSL and nMMU need to see all TLB invalidations for the memory
> contexts used on the adapter. For the hash memory model, it is done by
> making all TLBIs global as soon as the cxl driver is in use. For
> radix, we need something similar, but we can refine and only convert
> to global the invalidations for contexts actually used by the device.
> 
> The new mm_context_add_copro() API increments the 'active_cpus' count
> for the contexts attached to the cxl adapter. As soon as there's more
> than 1 active cpu, the TLBIs for the context become global. Active cpu
> count must be decremented when detaching to restore locality if
> possible and to avoid overflowing the counter.
> 
> The hash memory model support is somewhat limited, as we can't
> decrement the active cpus count when mm_context_remove_copro() is
> called, because we can't flush the TLB for a mm on hash. So TLBIs
> remain global on hash.
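
For readers following along, the counting scheme in the quoted description can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `active_cpus` and the add/remove pairing come from the description above, while `struct mm_ctx_model`, the `model_*` helpers and the `hash_mmu` flag are hypothetical names made up for illustration, not the patch's code. The TLB flush that radix needs on detach is only marked by a comment.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-mm context; only the counter matters. */
struct mm_ctx_model {
    atomic_int active_cpus; /* CPUs, plus attached coprocessors, using the mm */
    bool hash_mmu;          /* true: hash memory model, false: radix */
};

static void model_init(struct mm_ctx_model *ctx, int cpus, bool hash_mmu)
{
    atomic_init(&ctx->active_cpus, cpus);
    ctx->hash_mmu = hash_mmu;
}

/* Attaching a context to the adapter counts as one more "active cpu". */
static void model_add_copro(struct mm_ctx_model *ctx)
{
    atomic_fetch_add(&ctx->active_cpus, 1);
}

/*
 * Detaching: on radix the count is dropped again (real code would also
 * need a TLB flush around this point). On hash the mm cannot be flushed,
 * so the count is left alone and invalidations stay global.
 */
static void model_remove_copro(struct mm_ctx_model *ctx)
{
    if (ctx->hash_mmu)
        return;
    atomic_fetch_sub(&ctx->active_cpus, 1);
}

/* More than one active cpu means invalidations must be broadcast. */
static bool model_tlbi_is_global(struct mm_ctx_model *ctx)
{
    return atomic_load(&ctx->active_cpus) > 1;
}
```

With a single-threaded radix context, attaching flips the context to global invalidations and detaching restores local ones; with `hash_mmu` set, the context stays global after detach.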

Sorry I didn't look at this earlier, and I'm just wading in here a bit,
but what do you think of using mmu notifiers for invalidating nMMU and
coprocessor caches, rather than putting the details into the host MMU
management? npu-dma.c already looks to have almost everything covered
with its notifiers (in that it wouldn't have to rely on tlbie coming
from host MMU code).
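
A toy userspace sketch of that split, to make the shape concrete. Everything here is an assumption for illustration: the `toy_*` names are hypothetical and this is not the kernel's mmu_notifier API, just a callback list in which the host flush path walks whatever is registered and knows nothing about the device behind it.

```c
#include <stddef.h>

/* Hypothetical callback node; a device embeds one and registers it. */
struct toy_notifier {
    void (*invalidate_range)(struct toy_notifier *n,
                             unsigned long start, unsigned long end);
    struct toy_notifier *next;
};

/* Hypothetical address space: a list of notifiers plus a flush counter. */
struct toy_mm {
    struct toy_notifier *notifiers;
    unsigned long host_flushes;
};

static void toy_register(struct toy_mm *mm, struct toy_notifier *n)
{
    n->next = mm->notifiers;
    mm->notifiers = n;
}

/*
 * Host side: do the CPU invalidation (modelled as a counter), then tell
 * every registered notifier. No nMMU or coprocessor knowledge lives here.
 */
static void toy_flush_range(struct toy_mm *mm,
                            unsigned long start, unsigned long end)
{
    mm->host_flushes++;
    for (struct toy_notifier *n = mm->notifiers; n != NULL; n = n->next)
        n->invalidate_range(n, start, end);
}

/* Device side: the notifier is the first member, so the cast below is valid. */
struct toy_copro {
    struct toy_notifier notifier;
    unsigned long invalidations;
    unsigned long last_start;
    unsigned long last_end;
};

static void toy_copro_invalidate(struct toy_notifier *n,
                                 unsigned long start, unsigned long end)
{
    struct toy_copro *c = (struct toy_copro *)n;

    /* A real device would invalidate its own translation cache here. */
    c->invalidations++;
    c->last_start = start;
    c->last_end = end;
}
```

The point of the design is that the host path can later change how it invalidates CPUs without the device side noticing, as long as the callbacks keep firing.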

This change is not too bad today, but if we get to more complicated
MMU/nMMU TLB management like directed invalidation of particular units,
then putting more knowledge into the host code will end up being
complex I think.

I also want to do optimizations in the core code that assume we
only have to take care of other CPUs, e.g.,

https://patchwork.ozlabs.org/patch/811068/

Or, another example, directed IPI invalidations from the mm_cpumask
bitmap.

I realize you want to get something merged! For the merge window and
backports this seems fine. I think it would be nice soon afterwards to
get nMMU knowledge out of the core code... Though I also realize that,
with our tlbie instruction doing everything, it may be tricky to
make a really optimal notifier.

Thanks,
Nick

> 
> Signed-off-by: Frederic Barrat <fbarrat at linux.vnet.ibm.com>
> Fixes: f24be42aab37 ("cxl: Add psl9 specific code")
> ---
> Changelog:
> v3: don't decrement active cpus count with hash, as we don't know how to flush
> v2: Replace flush_tlb_mm() by the new flush_all_mm() to flush the TLBs
> and PWCs (thanks to Ben)

