[PATCH v3 1/2] powerpc64/bpf: Implement JIT support for private stack
bot+bpf-ci at kernel.org
Thu Feb 26 07:35:12 AEDT 2026
> diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
> --- a/arch/powerpc/net/bpf_jit_comp.c
> +++ b/arch/powerpc/net/bpf_jit_comp.c
[ ... ]
> @@ -209,6 +264,9 @@
> cgctx.is_subprog = bpf_is_subprog(fp);
> cgctx.exception_boundary = fp->aux->exception_boundary;
> cgctx.exception_cb = fp->aux->exception_cb;
> + cgctx.priv_sp = priv_stack_ptr;
> + if (priv_stack_ptr)
> + cgctx.stack_size = 0;
This zeros ctx->stack_size before the JIT passes, which is correct
for the kernel stack frame allocation (skipping BPF stack space on
the kernel stack).
However, emit_fp_priv_stack() also reads ctx->stack_size to compute
the BPF frame pointer offset within the private stack allocation:
> diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
> --- a/arch/powerpc/net/bpf_jit_comp64.c
> +++ b/arch/powerpc/net/bpf_jit_comp64.c
[ ... ]
> +static void emit_fp_priv_stack(u32 *image, struct codegen_context *ctx)
> +{
> + /* Load percpu data offset */
> + EMIT(PPC_RAW_LD(bpf_to_ppc(TMP_REG_1), _R13,
> + offsetof(struct paca_struct, data_offset)));
> + PPC_LI64(bpf_to_ppc(BPF_REG_FP), (__force long)ctx->priv_sp);
> + /*
> + * Load base percpu pointer of private stack allocation.
> + * Runtime per-cpu address = (base + data_offset) + (guard + stack_size)
> + */
> + EMIT(PPC_RAW_ADD(bpf_to_ppc(BPF_REG_FP),
> + bpf_to_ppc(TMP_REG_1), bpf_to_ppc(BPF_REG_FP)));
> + EMIT(PPC_RAW_ADDI(bpf_to_ppc(BPF_REG_FP), bpf_to_ppc(BPF_REG_FP),
> + PRIV_STACK_GUARD_SZ + round_up(ctx->stack_size, 16)));
> +}
Since ctx->stack_size was set to 0 earlier in bpf_int_jit_compile(),
the ADDI immediate here becomes PRIV_STACK_GUARD_SZ + 0 = 16.
This places BPF_REG_FP at the bottom of the usable stack area
(just past the overflow guard), rather than at the top.
The commit message describes the intended layout:
BPF frame pointer = Priv stack allocation start address +
Overflow guard +
Actual stack size defined by verifier
But the "Actual stack size defined by verifier" term is lost because
ctx->stack_size is already zero at this point.
BPF programs access the stack at negative offsets from FP (FP-8,
FP-16, and so on). With FP only 16 bytes above the allocation base,
FP-8 and FP-16 land inside the overflow guard, and any deeper access
falls below the allocation entirely.
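The arithmetic can be checked with a small userspace model. This is
only a sketch: PRIV_STACK_GUARD_SZ is assumed to be 16 (taken from the
"= 16" result above), and round_up16(), fp_offset() and access_offset()
are hypothetical helper names, not kernel functions.

```c
#include <stdint.h>

/* Assumed value, inferred from the "PRIV_STACK_GUARD_SZ + 0 = 16" result. */
#define PRIV_STACK_GUARD_SZ 16

/* Userspace stand-in for the kernel's round_up(x, 16). */
static inline uint32_t round_up16(uint32_t x)
{
	return (x + 15u) & ~15u;
}

/* Offset of BPF_REG_FP from the start of the private stack allocation. */
static inline int64_t fp_offset(uint32_t stack_size)
{
	return (int64_t)PRIV_STACK_GUARD_SZ + round_up16(stack_size);
}

/*
 * Offset, from the allocation start, of a BPF access at FP - below_fp.
 * A result in [0, PRIV_STACK_GUARD_SZ) hits the overflow guard; a
 * negative result is below the allocation.
 */
static inline int64_t access_offset(uint32_t stack_size, uint32_t below_fp)
{
	return fp_offset(stack_size) - (int64_t)below_fp;
}
```

With stack_size zeroed, access_offset(0, 8) is 8 (inside the guard) and
access_offset(0, 24) is -8 (below the allocation). With a real depth of
64, access_offset(64, 64) is 16, the first byte past the guard.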
For comparison, x86 in do_jit() pre-computes the offset before
zeroing:
priv_frame_ptr = priv_stack_ptr + PRIV_STACK_GUARD_SZ +
round_up(stack_depth, 8);
stack_depth = 0;
And arm64 in build_prologue() does not zero ctx->stack_size at all,
instead using it directly:
priv_stack_ptr = prog->aux->priv_stack_ptr + PRIV_STACK_GUARD_SZ;
emit_percpu_ptr(priv_sp, priv_stack_ptr, ctx);
emit(A64_ADD_I(1, fp, priv_sp, ctx->stack_size), ctx);
v1 of this patch also got this right, pre-computing the frame pointer
in bpf_jit_build_prologue():
priv_frame_ptr = (void*) ctx->priv_sp + PRIV_STACK_GUARD_SZ +
round_up(ctx->stack_size, 16);
It appears the "Minor refactoring" in v1->v2 moved the
cgctx.stack_size = 0 earlier in the pipeline and removed the
pre-computation step, but did not account for emit_fp_priv_stack()
needing the original stack depth.
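One possible shape for a fix, modelled in userspace after the x86
ordering quoted above, is to record the frame pointer offset before
zeroing the stack size. This is a sketch, not the author's fix:
struct model_ctx, its priv_frame_off field and model_setup() are
hypothetical, and PRIV_STACK_GUARD_SZ is again assumed to be 16.

```c
#include <stdint.h>

/* Assumed value, inferred from the arithmetic in this review. */
#define PRIV_STACK_GUARD_SZ 16

/* Simplified stand-in for struct codegen_context; not the kernel layout. */
struct model_ctx {
	uint32_t stack_size;     /* BPF stack bytes reserved on the kernel stack */
	uint32_t priv_frame_off; /* hypothetical: FP offset inside the private stack */
	int has_priv_stack;
};

/* Capture the verifier's stack depth before it is zeroed. */
static void model_setup(struct model_ctx *ctx, uint32_t verifier_depth,
			int has_priv_stack)
{
	ctx->stack_size = verifier_depth;
	ctx->has_priv_stack = has_priv_stack;
	ctx->priv_frame_off = 0;

	if (has_priv_stack) {
		/* Guard plus depth rounded up to 16, computed first ... */
		ctx->priv_frame_off = PRIV_STACK_GUARD_SZ +
				      ((verifier_depth + 15u) & ~15u);
		/* ... then drop the BPF stack from the kernel frame. */
		ctx->stack_size = 0;
	}
}
```

In this model the code that emits the ADDI would use priv_frame_off
rather than recomputing from stack_size, so the kernel frame can still
skip the BPF stack space without losing the verifier's depth.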
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/22413899411
AI-authorship-score: low
AI-authorship-explanation: Code follows established powerpc JIT patterns with architecture-specific constructs; commit history references specific CI-bot fixes consistent with human iterative development.
issues-found: 1
issue-severity-score: high
issue-severity-explanation: BPF frame pointer is miscalculated because emit_fp_priv_stack() uses the already-zeroed ctx->stack_size, so BPF stack accesses land in the overflow guard or below the private stack allocation, resulting in memory corruption.
More information about the Linuxppc-dev
mailing list