[PATCH 5/6] powerpc/mm: Optimize detection of thread local mm's

Tue Jul 25 22:00:59 AEST 2017

Nicholas Piggin <npiggin at gmail.com> writes:

> On Mon, 24 Jul 2017 23:46:44 +1000
> Michael Ellerman <mpe at ellerman.id.au> wrote:
>
>> Nicholas Piggin <npiggin at gmail.com> writes:
>> 
>> > On Mon, 24 Jul 2017 14:28:02 +1000
>> > Benjamin Herrenschmidt <benh at kernel.crashing.org> wrote:
>> >  
>> >> Instead of comparing the whole CPU mask every time, let's
>> >> keep a counter of how many bits are set in the mask. Thus
>> >> testing for a local mm only requires testing if that counter
>> >> is 1 and the current CPU bit is set in the mask.  
>> ...
>> >
>> > Also does it make sense to define it based on NR_CPUS > BITS_PER_LONG?
>> > If it's <= then it should be similar load and compare, no?  
>> 
>> Do we make a machine with that few CPUs? ;)
>> 
>> I don't think it's worth special casing, all the distros run with much
>> much larger NR_CPUs than that.
>
> Not further special-casing, but just casing it based on NR_CPUS
> rather than BOOK3S.

The problem is that mm_context_t is defined based on BookE vs BookS etc.,
not based on NR_CPUS.

So we'd have to add the atomic_t to all mm_context_t's, but #ifdef'ed
based on NR_CPUS.

But then some platforms don't support SMP, so it's a waste there. The
existing cpumask check compiles to almost nothing on UP.
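
For anyone following along, here is a rough userspace sketch of the two
checks being compared: the counter-based test from the patch description
and a plain single-word mask compare, which is what the NR_CPUS <=
BITS_PER_LONG case would amount to. Everything named sketch_* or SKETCH_*
is made up for illustration and is not the kernel's mm_context_t or
cpumask API. The mask update is deliberately not atomic, so this only
models the logic, not the concurrency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for NR_CPUS; chosen so the mask fits in one word. */
#define SKETCH_NR_CPUS 64

/* Hypothetical stand-in for the per-mm context, not the real mm_context_t. */
struct sketch_mm {
    uint64_t cpu_mask;       /* one bit per CPU that has used this mm */
    atomic_int active_cpus;  /* number of bits currently set in cpu_mask */
};

static inline void sketch_mm_init(struct sketch_mm *mm)
{
    mm->cpu_mask = 0;
    atomic_init(&mm->active_cpus, 0);
}

/* Record that `cpu` uses `mm`; the counter moves only on the first use. */
static inline void sketch_mm_mark_cpu(struct sketch_mm *mm, unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    uint64_t bit = UINT64_C(1) << cpu;

    if (!(mm->cpu_mask & bit)) {
        mm->cpu_mask |= bit;                    /* not atomic: sketch only */
        atomic_fetch_add(&mm->active_cpus, 1);
    }
}

/* Counter-based test: exactly one CPU in the mask, and it is `cpu`. */
static inline bool sketch_mm_is_local_counter(struct sketch_mm *mm,
                                              unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    return atomic_load(&mm->active_cpus) == 1 &&
           (mm->cpu_mask & (UINT64_C(1) << cpu)) != 0;
}

/* Mask-only test: with a one-word mask this is a single load and compare. */
static inline bool sketch_mm_is_local_mask(const struct sketch_mm *mm,
                                           unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    return mm->cpu_mask == (UINT64_C(1) << cpu);
}
```

With a one-word mask both tests give the same answer, which is the
point of the NR_CPUS question. The counter only pays off once the mask
spans several words, and that is the case the atomic_t would have to be
#ifdef'ed in for.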

cheers