[RFC PATCH] powerpc/book3s64/radix: Upgrade va tlbie to PID tlbie if we cross PMD_SIZE

Puvichakravarthy Ramachandran puvichakravarthy at in.ibm.com
Fri Aug 6 15:22:32 AEST 2021


> With a shared mapping, even though we are unmapping a large range, the
> kernel will force a TLB flush with the ptl lock held to avoid the race
> mentioned in commit 1cf35d47712d ("mm: split 'tlb_flush_mmu()' into tlb
> flushing and memory freeing parts"). This results in the kernel issuing
> a high number of TLB flushes even for a large range. This can be
> improved by making the kernel switch to a PID-based flush when it is
> unmapping a 2M range.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
> ---
>  arch/powerpc/mm/book3s64/radix_tlb.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
> index aefc100d79a7..21d0f098e43b 100644
> --- a/arch/powerpc/mm/book3s64/radix_tlb.c
> +++ b/arch/powerpc/mm/book3s64/radix_tlb.c
> @@ -1106,7 +1106,7 @@ EXPORT_SYMBOL(radix__flush_tlb_kernel_range);
>   * invalidating a full PID, so it has a far lower threshold to change from
>   * individual page flushes to full-pid flushes.
>   */
> -static unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
> +static unsigned long tlb_single_page_flush_ceiling __read_mostly = 32;
>  static unsigned long tlb_local_single_page_flush_ceiling __read_mostly = POWER9_TLB_SETS_RADIX * 2;
> 
>  static inline void __radix__flush_tlb_range(struct mm_struct *mm,
> @@ -1133,7 +1133,7 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
>       if (fullmm)
>               flush_pid = true;
>       else if (type == FLUSH_TYPE_GLOBAL)
> -             flush_pid = nr_pages > tlb_single_page_flush_ceiling;
> +             flush_pid = nr_pages >= tlb_single_page_flush_ceiling;
>       else
>               flush_pid = nr_pages > tlb_local_single_page_flush_ceiling;
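
For reference: assuming a 64K base page size, a PMD-sized (2M) range
covers 2M / 64K = 32 pages, so lowering the ceiling from 33 to 32 and
changing the comparison to >= makes a PMD-sized global unmap take the
single PID-flush path instead of issuing 32 individual tlbies. A
standalone sketch of that arithmetic (illustrative only, not kernel
code):

    #include <stdio.h>

    #define PAGE_SHIFT 16                   /* 64K base pages assumed */
    #define PMD_SIZE   (1UL << 21)          /* 2M */

    int main(void)
    {
            unsigned long ceiling = 32;     /* tlb_single_page_flush_ceiling */
            unsigned long nr_pages = PMD_SIZE >> PAGE_SHIFT;

            /* old test: 32 > 33  -> false: 32 per-page tlbies */
            /* new test: 32 >= 32 -> true:  one PID-wide flush */
            printf("nr_pages = %lu, flush_pid = %d\n",
                   nr_pages, nr_pages >= ceiling);
            return 0;
    }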

I evaluated the patches from Aneesh with a microbenchmark that does
shmat and shmdt of a 256 MB segment (a minimal sketch of the loop
follows below). The higher the rate of work, the better the
performance. With a ceiling value of 32, we match the performance of
GTSE=off. This was evaluated on a SLES15 SP3 kernel.
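
A minimal sketch of such a benchmark loop (a hypothetical
reconstruction, not the actual ./tlbie tool, and assuming a 64K base
page size):

    #include <stdio.h>
    #include <time.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    #define SEG_SIZE (256UL << 20)          /* 256 MB, as measured above */
    #define STRIDE   (1UL << 16)            /* 64K base page size assumed */

    int main(void)
    {
            int id = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
            unsigned long off, ops = 0;
            time_t start = time(NULL);

            if (id < 0)
                    return 1;

            while (time(NULL) - start < 5) {
                    char *p = shmat(id, NULL, 0);

                    if (p == (void *)-1)
                            break;
                    /* Fault in every page so that shmdt() unmaps a fully
                     * populated 256 MB shared range and exercises the
                     * flush path changed by the patch. */
                    for (off = 0; off < SEG_SIZE; off += STRIDE)
                            p[off] = 1;
                    if (shmdt(p) < 0)
                            break;
                    ops++;
            }
            printf("Rate of work: = %lu\n", ops / 5);
            shmctl(id, IPC_RMID, NULL);
            return 0;
    }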


# cat /sys/kernel/debug/powerpc/tlb_single_page_flush_ceiling 
32

# perf stat -I 1000 -a -e powerpc:tlbie,r30058 ./tlbie -i 5 -c 1 t 1
 Rate of work: = 311 
#           time             counts unit events
     1.013131404              50939      powerpc:tlbie   
     1.013131404              50658      r30058  
 Rate of work: = 318 
     2.026957019              51520      powerpc:tlbie   
     2.026957019              51481      r30058  
 Rate of work: = 318 
     3.038884431              51485      powerpc:tlbie   
     3.038884431              51461      r30058  
 Rate of work: = 318 
     4.051483926              51485      powerpc:tlbie   
     4.051483926              51520      r30058  
 Rate of work: = 318 
     5.063635713              48577      powerpc:tlbie   
     5.063635713              48347      r30058  
 
# echo 34 > /sys/kernel/debug/powerpc/tlb_single_page_flush_ceiling 

# perf stat -I 1000 -a -e powerpc:tlbie,r30058 ./tlbie -i 5 -c 1 t 1
 Rate of work: = 174 
#           time             counts unit events
     1.012672696             721471      powerpc:tlbie   
     1.012672696             726491      r30058  
 Rate of work: = 177 
     2.026348661             737460      powerpc:tlbie   
     2.026348661             736138      r30058  
 Rate of work: = 178 
     3.037932122             737460      powerpc:tlbie   
     3.037932122             737460      r30058  
 Rate of work: = 178 
     4.050198819             737044      powerpc:tlbie   
     4.050198819             737460      r30058  
 Rate of work: = 177 
     5.062400776             692832      powerpc:tlbie   
     5.062400776             688319      r30058          


Regards,
Puvichakravarthy Ramachandran
