[RFC PATCH] powerpc/ftrace: Refactoring and support for -fpatchable-function-entry

Christophe Leroy christophe.leroy at csgroup.eu
Fri May 26 15:35:46 AEST 2023



On 23/05/2023 at 11:31, Naveen N Rao wrote:
> Christophe Leroy wrote:
>>
>> That's better, but it still takes more time than the original implementation:
>>
>> +20% to activate function tracer (was +40% with your RFC)
>> +21% to activate nop tracer (was +24% with your RFC)
>>
>> perf record (without strict kernel rwx):
>>
>>      17.75%  echo     [kernel.kallsyms]   [k] ftrace_check_record
>>       9.76%  echo     [kernel.kallsyms]   [k] ftrace_replace_code
>>       6.53%  echo     [kernel.kallsyms]   [k] patch_instruction
>>       5.21%  echo     [kernel.kallsyms]   [k] __ftrace_hash_rec_update
>>       4.26%  echo     [kernel.kallsyms]   [k] ftrace_get_addr_curr
>>       4.18%  echo     [kernel.kallsyms]   [k] ftrace_get_call_inst.isra.0
>>       3.45%  echo     [kernel.kallsyms]   [k] ftrace_get_addr_new
>>       3.08%  echo     [kernel.kallsyms]   [k] function_trace_call
>>       2.20%  echo     [kernel.kallsyms]   [k] __rb_reserve_next.constprop.0
>>       2.05%  echo     [kernel.kallsyms]   [k] copy_page
>>       1.91%  echo     [kernel.kallsyms]   [k] ftrace_create_branch_inst.constprop.0
>>       1.83%  echo     [kernel.kallsyms]   [k] ftrace_rec_iter_next
>>       1.83%  echo     [kernel.kallsyms]   [k] rb_commit
>>       1.69%  echo     [kernel.kallsyms]   [k] ring_buffer_lock_reserve
>>       1.54%  echo     [kernel.kallsyms]   [k] trace_function
>>       1.39%  echo     [kernel.kallsyms]   [k] __call_rcu_common.constprop.0
>>       1.25%  echo     ld-2.23.so          [.] do_lookup_x
>>       1.17%  echo     [kernel.kallsyms]   [k] ftrace_rec_iter_record
>>       1.03%  echo     [kernel.kallsyms]   [k] unmap_page_range
>>       0.95%  echo     [kernel.kallsyms]   [k] flush_dcache_icache_page
>>       0.95%  echo     [kernel.kallsyms]   [k] ftrace_lookup_ip
> 
> Ok, I simplified this further, and this is as close to the previous fast 
> path as we can get (applies atop the original RFC). The only difference 
> left is the ftrace_rec iterator.

That's not better; it is even slightly worse (by less than 1%).

I will try to investigate why.

Christophe

