[RFC PATCH] powerpc/64/ftrace: mprofile-kernel patch out mflr

Naveen N. Rao naveen.n.rao at linux.ibm.com
Fri May 17 04:22:19 AEST 2019


Nicholas Piggin wrote:
> Naveen N. Rao's on May 14, 2019 6:32 pm:
>> Michael Ellerman wrote:
>>> "Naveen N. Rao" <naveen.n.rao at linux.ibm.com> writes:
>>>> Michael Ellerman wrote:
>>>>> Nicholas Piggin <npiggin at gmail.com> writes:
>>>>>> The new mprofile-kernel mcount sequence is
>>>>>>
>>>>>>   mflr	r0
>>>>>>   bl	_mcount
>>>>>>
>>>>>> Dynamic ftrace patches the branch instruction with a noop, but leaves
>>>>>> the mflr. mflr is executed by the branch unit that can only execute one
>>>>>> per cycle on POWER9 and shared with branches, so it would be nice to
>>>>>> avoid it where possible.
>>>>>>
>>>>>> This patch is a hacky proof of concept to nop out the mflr. Can we do
>>>>>> this or are there races or other issues with it?
>>>>> 
>>>>> There's a race, isn't there?
>>>>> 
>>>>> We have a function foo which currently has tracing disabled, so the mflr
>>>>> and bl are nop'ed out.
>>>>> 
>>>>>   CPU 0			CPU 1
>>>>>   ==================================
>>>>>   bl foo
>>>>>   nop (ie. not mflr)
>>>>>   -> interrupt
>>>>>   something else	enable tracing for foo
>>>>>   ...			patch mflr and branch
>>>>>   <- rfi
>>>>>   bl _mcount
>>>>> 
>>>>> So we end up in _mcount() but with r0 not populated.
>>>>
>>>> Good catch! Looks like we need to patch the mflr with a "b +8" similar 
>>>> to what we do in __ftrace_make_nop().
>>> 
>>> Would that actually make it any faster though? Nick?
>> 
>> Ok, how about doing this as a 2-step process?
>> 1. patch 'mflr r0' with a 'b +8'
>>    synchronize_rcu_tasks()
>> 2. convert 'b +8' to a 'nop'
> 
> Good idea. Well the mflr r0 is harmless, so you can leave that in.
> You just need to ensure it's not removed before the bl is. So nop
> the bl _mcount, then synchronize_rcu_tasks(), then nop the mflr?

The problem actually seems to be when we try to patch in the branch to 
_mcount(), rather than when we are patching in the nop instructions 
(i.e., the race is when we try to enable the function tracer, rather 
than while disabling it).

When we disable ftrace, we only need to ensure we patch out the branch 
to _mcount() before patching out the preceding 'mflr r0'. I don't think 
we need a synchronize_rcu_tasks() in that case.

While enabling ftrace, we will first need to patch the preceding 'mflr 
r0' (which would now be a 'nop') with 'b +8', then use 
synchronize_rcu_tasks() and finally patch in 'bl _mcount()' followed by 
'mflr r0'.

I think that's what you meant, just that my reference to patching 'mflr 
r0' with a 'b +8' should have called out that the mflr would have been 
nop'ed out.

- Naveen



More information about the Linuxppc-dev mailing list