[PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

Frederic Barrat fbarrat at linux.vnet.ibm.com
Fri Aug 25 02:40:34 AEST 2017



On 21/08/2017 at 19:35, Benjamin Herrenschmidt wrote:
> On Mon, 2017-08-21 at 19:27 +0200, Frederic Barrat wrote:
>> Hi Ben,
>>
>> On 24/07/2017 at 06:28, Benjamin Herrenschmidt wrote:
>>> Instead of comparing the whole CPU mask every time, let's
>>> keep a counter of how many bits are set in the mask. Thus
>>> testing for a local mm only requires testing if that counter
>>> is 1 and the current CPU bit is set in the mask.
>>
>>
>> I'm trying to see if we could merge this patch with what I'm trying to
>> do to mark a context as requiring global TLBIs.
>> In http://patchwork.ozlabs.org/patch/796775/
>> I'm introducing a 'flags' per memory context, using one bit to say if
>> the context needs global TLBIs.
>> The two could co-exist, just checking... Are you thinking of using the
>> actual active_cpus count down the road, or is it just a matter of
>> knowing whether there is more than one active CPU?
> 
> Or you could just increment my counter. Just make sure you increment
> it at most once per CXL context and decrement when the context is gone.

The decrementing part is giving me trouble, and I think I understand why:
if I decrement the counter when detaching the context from the capi
card, then subsequent TLBIs for the memory context may go back to being
local. So when the process exits, the NPU wouldn't get the associated
TLBIs, which spells trouble the next time the same memory context ID is
reused. I believe this is the cause of the problem I'm seeing. As soon as
I keep the TLBIs global, even after I detach from the capi adapter,
everything is fine.

Does it sound right?

So, to keep the check in mm_is_thread_local() minimal, i.e. just testing
the active_cpus count, I'm thinking of introducing a "copro enabled" bit
on the context, so that we increment active_cpus only once and never
decrement it.
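
To make the idea concrete, here is a rough userspace sketch in C11, not
the actual kernel code. The struct, its field layout and every function
name are hypothetical stand-ins for illustration; only active_cpus, the
"copro enabled" bit and the mm_is_thread_local() check come from the
discussion above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a memory context; NOT the kernel's mm_context_t. */
struct mm_ctx_model {
	atomic_int   active_cpus;   /* CPUs using the mm, plus 1 if a copro ever attached */
	atomic_bool  copro_enabled; /* the proposed "copro enabled" bit */
	atomic_ulong cpumask;       /* simplified CPU mask: one bit per CPU */
};

static void mm_ctx_init(struct mm_ctx_model *ctx)
{
	atomic_init(&ctx->active_cpus, 0);
	atomic_init(&ctx->copro_enabled, false);
	atomic_init(&ctx->cpumask, 0UL);
}

/* A CPU starts using the mm: bump the counter only on a 0 -> 1 bit change. */
static void mm_ctx_cpu_attach(struct mm_ctx_model *ctx, int cpu)
{
	unsigned long bit = 1UL << cpu;
	unsigned long old = atomic_fetch_or(&ctx->cpumask, bit);

	if (!(old & bit))
		atomic_fetch_add(&ctx->active_cpus, 1);
}

/* First coprocessor attach increments the counter; later attaches do not. */
static void mm_ctx_enable_copro(struct mm_ctx_model *ctx)
{
	if (!atomic_exchange(&ctx->copro_enabled, true))
		atomic_fetch_add(&ctx->active_cpus, 1);
}

/*
 * Detach deliberately leaves both the bit and the counter alone, so the
 * context keeps looking non-local (global TLBIs) until it is destroyed.
 */
static void mm_ctx_detach_copro(struct mm_ctx_model *ctx)
{
	(void)ctx;
}

/* Local only if exactly one user is counted and it is the current CPU. */
static bool mm_is_thread_local_model(struct mm_ctx_model *ctx, int cpu)
{
	return atomic_load(&ctx->active_cpus) == 1 &&
	       (atomic_load(&ctx->cpumask) & (1UL << cpu));
}
```

With this shape, a context that was attached to a coprocessor even once
never reports itself as thread local again, which matches the behaviour
I see working: the TLBIs stay global after the detach.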

   Fred


