[PATCH v2 1/6] powerpc64/bpf: Move tail_call_cnt to bottom of stack frame

Hari Bathini hbathini at linux.ibm.com
Sat Jan 17 21:11:40 AEDT 2026



On 14/01/26 5:14 pm, adubey at linux.ibm.com wrote:
> From: Abhishek Dubey <adubey at linux.ibm.com>
> 
> In the conventional stack frame, the position of tail_call_cnt
> is after the NVR save area (BPF_PPC_STACK_SAVE). Whereas, the
> offset of tail_call_cnt in the trampoline frame is after the
> stack alignment padding. BPF JIT logic could become complex
> when dealing with frame-sensitive offset calculation of
> tail_call_cnt. Having the same offset in both frames is the
> desired objective.
> 
> The trampoline frame does not have a BPF_PPC_STACK_SAVE area.
> Introducing one would waste memory, since the area would serve
> only to align the offset of tail_call_cnt.
> Another challenge is the variable alignment padding sitting at
> the bottom of the trampoline frame, which requires additional
> handling to compute the tail_call_cnt offset.
> 
> This patch addresses the above issues by moving tail_call_cnt
> to the bottom of the stack frame, at offset 0, for both types
> of frames. This avoids the additional bytes that BPF_PPC_STACK_SAVE
> would require in the trampoline frame, and a common offset
> computation for tail_call_cnt serves both frames.
> 
> The changes in this patch are required by the third patch in the
> series, where the 'reference to tail_call_info' of the main frame
> is copied into the trampoline frame from the previous frame.
> 
> Signed-off-by: Abhishek Dubey <adubey at linux.ibm.com>
> ---
>   arch/powerpc/net/bpf_jit.h        |  4 ++++
>   arch/powerpc/net/bpf_jit_comp64.c | 31 ++++++++++++++++++++-----------
>   2 files changed, 24 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h
> index 8334cd667bba..45d419c0ee73 100644
> --- a/arch/powerpc/net/bpf_jit.h
> +++ b/arch/powerpc/net/bpf_jit.h
> @@ -72,6 +72,10 @@
>   	} } while (0)
>   
>   #ifdef CONFIG_PPC64
> +
> +/* for tailcall counter */
> +#define BPF_PPC_TAILCALL        8
> +
>   /* If dummy pass (!image), account for maximum possible instructions */
>   #define PPC_LI64(d, i)		do {					      \
>   	if (!image)							      \
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> index 1fe37128c876..39061cd742c1 100644
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
> @@ -20,13 +20,15 @@
>   #include "bpf_jit.h"
>   
>   /*
> - * Stack layout:
> + * Stack layout 1:
> + * Layout when setting up our own stack frame.
> + * Note: r1 at bottom, component offsets positive wrt r1.
>    * Ensure the top half (upto local_tmp_var) stays consistent
>    * with our redzone usage.
>    *
>    *		[	prev sp		] <-------------
> - *		[   nv gpr save area	] 6*8		|
>    *		[    tail_call_cnt	] 8		|
> + *		[   nv gpr save area	] 6*8		|
>    *		[    local_tmp_var	] 24		|
>    * fp (r31) -->	[   ebpf stack space	] upto 512	|
>    *		[     frame header	] 32/112	|
> @@ -36,10 +38,12 @@
>   /* for gpr non volatile registers BPG_REG_6 to 10 */
>   #define BPF_PPC_STACK_SAVE	(6*8)
>   /* for bpf JIT code internal usage */
> -#define BPF_PPC_STACK_LOCALS	32
> +#define BPF_PPC_STACK_LOCALS	24
>   /* stack frame excluding BPF stack, ensure this is quadword aligned */
>   #define BPF_PPC_STACKFRAME	(STACK_FRAME_MIN_SIZE + \
> -				 BPF_PPC_STACK_LOCALS + BPF_PPC_STACK_SAVE)
> +				 BPF_PPC_STACK_LOCALS + \
> +				 BPF_PPC_STACK_SAVE   + \
> +				 BPF_PPC_TAILCALL)
>   
>   /* BPF register usage */
>   #define TMP_REG_1	(MAX_BPF_JIT_REG + 0)
> @@ -87,27 +91,32 @@ static inline bool bpf_has_stack_frame(struct codegen_context *ctx)
>   }
>   

>   /*
> + * Stack layout 2:
>    * When not setting up our own stackframe, the redzone (288 bytes) usage is:
> + * Note: r1 from prev frame. Component offset negative wrt r1.
>    *
>    *		[	prev sp		] <-------------
>    *		[	  ...       	] 		|
>    * sp (r1) --->	[    stack pointer	] --------------
> - *		[   nv gpr save area	] 6*8
>    *		[    tail_call_cnt	] 8
> + *		[   nv gpr save area	] 6*8
>    *		[    local_tmp_var	] 24
>    *		[   unused red zone	] 224
>    */

Calling these "stack layout 1" and "stack layout 2" is inappropriate:
the stack layout is essentially the same in both cases. The two comments
only show it with reference to r1, once for when the stack frame is set
up explicitly and once for when the redzone is being used.
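
To make that concrete, here is a small standalone sketch (not kernel
code) that computes the tail_call_cnt offset relative to r1 for both
cases, using the sizes from the quoted hunk. The helper names
tail_call_cnt_off() and prev_sp_off() are hypothetical, and
STACK_FRAME_MIN_SIZE is assumed to be 32 (the smaller of the "32/112"
values in the layout comment). In both cases tail_call_cnt ends up in
the 8 bytes directly below prev sp; only the reference point differs.

```c
#include <assert.h>

/* Assumption: the 32-byte frame header from the "32/112" comment. */
#define STACK_FRAME_MIN_SIZE	32
/* Sizes as they appear in the quoted patch. */
#define BPF_PPC_STACK_SAVE	(6 * 8)
#define BPF_PPC_STACK_LOCALS	24
#define BPF_PPC_TAILCALL	8
#define BPF_PPC_STACKFRAME	(STACK_FRAME_MIN_SIZE + \
				 BPF_PPC_STACK_LOCALS + \
				 BPF_PPC_STACK_SAVE   + \
				 BPF_PPC_TAILCALL)

/* The frame size excluding the BPF stack must stay quadword aligned. */
static_assert(BPF_PPC_STACKFRAME % 16 == 0, "frame not quadword aligned");

/*
 * Hypothetical helper: distance from r1 up to prev sp.
 * With our own frame, r1 sits a full frame below prev sp.
 * With redzone usage, r1 is still the previous frame's stack pointer.
 */
static long prev_sp_off(int has_frame, long stack_size)
{
	return has_frame ? BPF_PPC_STACKFRAME + stack_size : 0;
}

/*
 * Hypothetical helper: offset of tail_call_cnt relative to r1.
 * Positive when we set up our own frame, negative in the redzone.
 */
static long tail_call_cnt_off(int has_frame, long stack_size)
{
	return prev_sp_off(has_frame, stack_size) - BPF_PPC_TAILCALL;
}
```

With these assumptions, tail_call_cnt_off(1, 512) is 616 and
tail_call_cnt_off(0, 0) is -8, and in both cases the slot is 8 bytes
below prev sp, which is why a single layout comment with both r1
positions marked would describe the same thing.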

- Hari


