[RFC PATCH v2 3/5] powerpc/ftrace: Unify 32-bit and 64-bit ftrace entry code

Naveen N Rao naveen at kernel.org
Wed Jun 12 00:47:19 AEST 2024


On Mon, Jun 10, 2024 at 04:06:32PM GMT, Steven Rostedt wrote:
> On Mon, 10 Jun 2024 14:08:16 +0530
> Naveen N Rao <naveen at kernel.org> wrote:
> 
> > On 32-bit powerpc, gcc generates a three instruction sequence for
> > function profiling:
> > 	mflr	r0
> > 	stw	r0, 4(r1)
> > 	bl	_mcount
> > 
> > On kernel boot, the call to _mcount() is nop-ed out, to be patched back
> > in when ftrace is actually enabled. The 'stw' instruction therefore is
> > not necessary unless ftrace is enabled. Nop it out during ftrace init.
> > 
> > When ftrace is enabled, we want the 'stw' so that stack unwinding works
> > properly. Perform the same within the ftrace handler, similar to 64-bit
> > powerpc.
> > 
> > For 64-bit powerpc, early versions of gcc used to emit a three
> > instruction sequence for function profiling (with -mprofile-kernel) with
> > a 'std' instruction to mimic the 'stw' above. Address that scenario also
> > by nop-ing out the 'std' instruction during ftrace init.
> > 
> > Signed-off-by: Naveen N Rao <naveen at kernel.org>
> 
> Isn't there still the race that there's a preemption between the:
> 
> 	stw	r0, 4(r1)
> and
> 	bl	_mcount
> 
> And if this breaks stack unwinding, couldn't this cause an issue for live
> kernel patching?
> 
> I know it's very unlikely, but in theory, I think the race exists.

I *think* you are assuming that we will be patching back the 'stw' 
instruction here? So, there could be an issue if a cpu has executed the 
nop instead of 'stw' and then sees the call to _mcount().

But, we don't patch back the 'stw' instruction. That is instead done as 
part of ftrace_caller(), along with setting up an additional stack frame 
to ensure reliable stack unwinding. Commit 41a506ef71eb 
("powerpc/ftrace: Create a dummy stackframe to fix stack unwind") has 
more details.

The primary motivation for this patch is to address differences in the 
function profile sequence with various toolchains. Since commit 
0f71dcfb4aef ("powerpc/ftrace: Add support for 
-fpatchable-function-entry"), we use the same two-instruction profile 
sequence across 32-bit and 64-bit powerpc:
	mflr	r0
	bl	ftrace_caller

This has also been true on 64-bit powerpc with -mprofile-kernel, except 
the very early versions of gcc that supported that option (gcc v5).

On 32-bit powerpc, we used to use the three instruction sequence before 
support for -fpatchable-function-entry was introduced.

In this patch, we move all toolchain variants to use the two-instruction 
sequence for consistency.


Thanks,
Naveen


More information about the Linuxppc-dev mailing list