[PATCH 1/4] powerpc/bpf: Move out dummy_tramp_addr after Long branch stub

Hari Bathini hbathini at linux.ibm.com
Mon Mar 16 05:06:06 AEDT 2026



On 25/02/26 7:06 am, adubey at linux.ibm.com wrote:
> From: Abhishek Dubey <adubey at linux.ibm.com>
> 
> Move the long branch address space to the bottom of the long
> branch stub. This allows uninterrupted disassembly until the
> last 8 bytes. Exclude these last bytes from the overall
> program length to prevent failure in assembly generation.
> Also, align the dummy_tramp_addr field on an 8-byte boundary.
> 
> Following is disassembler output for test program with moved down
> dummy_tramp_addr field:
> .....
> .....
> pc:68    left:44     a6 03 08 7c  :  mtlr 0
> pc:72    left:40     bc ff ff 4b  :  b .-68
> pc:76    left:36     a6 02 68 7d  :  mflr 11
> pc:80    left:32     05 00 9f 42  :  bcl 20, 31, .+4
> pc:84    left:28     a6 02 88 7d  :  mflr 12
> pc:88    left:24     14 00 8c e9  :  ld 12, 20(12)
> pc:92    left:20     a6 03 89 7d  :  mtctr 12
> pc:96    left:16     a6 03 68 7d  :  mtlr 11
> pc:100   left:12     20 04 80 4e  :  bctr
> pc:104   left:8      c0 34 1d 00  :
> 
> Failure log:
> Can't disasm instruction at offset 104: c0 34 1d 00 00 00 00 c0
> Disassembly logic can truncate at 104, ignoring last 8 bytes.
> 
> Update the dummy_tramp_addr field offset calculation from the end
> of the program to reflect its new location, for bpf_arch_text_poke()
> to update the actual trampoline's address in this field.
> 
> All BPF trampoline selftests continue to pass with this patch applied.
> 
> Signed-off-by: Abhishek Dubey <adubey at linux.ibm.com>
> ---
>   arch/powerpc/net/bpf_jit_comp.c | 45 +++++++++++++++++++++++++--------
>   1 file changed, 34 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> index 7a78e03d482f..f8f6305b0d9f 100644
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
> @@ -51,7 +51,9 @@ asm (
>   
>   void bpf_jit_build_fentry_stubs(u32 *image, struct codegen_context *ctx)
>   {
> -	int ool_stub_idx, long_branch_stub_idx;
> +	int ool_stub_idx, long_branch_stub_idx, tramp_load_offset;
> +	bool tramp_needs_align;
> +	u32 tramp_idx;
>   
>   	/*
>   	 * Out-of-line stub:
> @@ -70,27 +72,45 @@ void bpf_jit_build_fentry_stubs(u32 *image, struct codegen_context *ctx)
>   
>   	/*
>   	 * Long branch stub:
> -	 *	.long	<dummy_tramp_addr>
>   	 *	mflr	r11
>   	 *	bcl	20,31,$+4
> -	 *	mflr	r12
> -	 *	ld	r12, -8-SZL(r12)
> +	 *	mflr	r12		// lr/r12 stores current pc
> +	 *	ld	r12, 20(r12)	// offset(dummy_tramp_addr) from prev inst. is 20
>   	 *	mtctr	r12
> -	 *	mtlr	r11 // needed to retain ftrace ABI
> +	 *	mtlr	r11		// needed to retain ftrace ABI
>   	 *	bctr
> +	 *	nop			// Optional, for mem alignment of dummy_tramp_addr
> +	 *	.long	<dummy_tramp_addr>
>   	 */
> -	if (image)
> -		*((unsigned long *)&image[ctx->idx]) = (unsigned long)dummy_tramp;
> -	ctx->idx += SZL / 4;
>   	long_branch_stub_idx = ctx->idx;
>   	EMIT(PPC_RAW_MFLR(_R11));
>   	EMIT(PPC_RAW_BCL4());
>   	EMIT(PPC_RAW_MFLR(_R12));
> -	EMIT(PPC_RAW_LL(_R12, _R12, -8-SZL));
> +
> +	/* Relative offset of dummy_tramp_addr wrt start of long branch stub */
> +	tramp_idx = long_branch_stub_idx + 7;
> +	/*
> +	 * The image layout is not guaranteed to be 8-byte aligned.
> +	 * The lower 3 bits must be clear for 8-byte alignment.
> +	 * Adjust the offset for a padding NOP before dummy_tramp_addr.
> +	 */

> +	tramp_needs_align = (((unsigned long)&image[tramp_idx]) & 7) != 0;

I would rather check:

   is_8byte_aligned = (((unsigned long)&image[tramp_idx]) & 0x7) == 0x0;

and handle alignment when !is_8byte_aligned for better readability.

This alignment handling needs to go under CONFIG_PPC64.

Also, this alignment handling fix has nothing to do with the moving
around of dummy_tramp_addr. Handle the alignment in a separate patch
with a Fixes tag so stable releases can pick it up...

- Hari
