[PATCH v5 17/17] powerpc64/bpf: Add support for bpf trampolines

Hari Bathini hbathini at linux.ibm.com
Thu Oct 10 20:46:15 AEDT 2024



On 10/10/24 3:09 pm, Hari Bathini wrote:
> 
> 
> On 10/10/24 5:48 am, Michael Ellerman wrote:
>> Alexei Starovoitov <alexei.starovoitov at gmail.com> writes:
>>> On Tue, Oct 1, 2024 at 12:18 AM Hari Bathini <hbathini at linux.ibm.com> wrote:
>>>> On 30/09/24 6:25 pm, Alexei Starovoitov wrote:
>>>>> On Sun, Sep 29, 2024 at 10:33 PM Hari Bathini <hbathini at linux.ibm.com> wrote:
>>>>>> On 17/09/24 1:20 pm, Alexei Starovoitov wrote:
>>>>>>> On Sun, Sep 15, 2024 at 10:58 PM Hari Bathini <hbathini at linux.ibm.com> wrote:
>>>>>>>>
>>>>>>>> +
>>>>>>>> +       /*
>>>>>>>> +        * Generated stack layout:
>>>>>>>> +        *
>>>>>>>> +        * func prev back chain         [ back chain        ]
>>>>>>>> +        *                              [                   ]
>>>>>>>> +        * bpf prog redzone/tailcallcnt [ ...               ] 64 bytes (64-bit powerpc)
>>>>>>>> +        *                              [                   ] --
>>>>>>> ...
>>>>>>>> +
>>>>>>>> +       /* Dummy frame size for proper unwind - includes 64-bytes red zone for 64-bit powerpc */
>>>>>>>> +       bpf_dummy_frame_size = STACK_FRAME_MIN_SIZE + 64;
>>>>>>>
>>>>>>> What is the goal of such a large "red zone" ?
>>>>>>> The kernel stack is a limited resource.
>>>>>>> Why reserve 64 bytes ?
>>>>>>> tail call cnt can probably be optional as well.
>>>>>>
>>>>>> Hi Alexei, thanks for reviewing.
>>>>>> FWIW, the redzone on ppc64 is 288 bytes. BPF JIT for ppc64 was using
>>>>>> a redzone of 80 bytes since tailcall support was introduced [1].
>>>>>> It came down to 64 bytes thanks to [2]. The red zone is being used
>>>>>> to save NVRs and tail call count when a stack is not set up. I do
>>>>>> agree that we should look at optimizing it further. Do you think
>>>>>> the optimization should go as part of PPC64 trampoline enablement
>>>>>> being done here or should that be taken up as a separate item, maybe?
>>>>>
>>>>> The follow up is fine.
>>>>> It's just odd to me that we currently have:
>>>>>
>>>>> [   unused red zone ] 208 bytes protected
>>>>>
>>>>> I simply don't understand why we need to waste this much stack space.
>>>>> Why can't it be zero today ?
>>>>
>>>> The ABI for ppc64 has a redzone of 288 bytes below the current
>>>> stack pointer that can be used as a scratch area until a new
>>>> stack frame is created. So, no wastage of stack space as such.
>>>> It is just red zone that can be used before a new stack frame
>>>> is created. The comment there is only to show how redzone is
>>>> being used in ppc64 BPF JIT. I think the confusion is with the
>>>> mention of "208 bytes" as protected. As not all of that scratch
>>>> area is used, it mentions the remaining as unused. Essentially
>>>> 288 bytes below current stack pointer is protected from debuggers
>>>> and interrupt code (red zone). Note that it should be 224 bytes
>>>> of unused red zone instead of 208 bytes as red zone usage in
>>>> ppc64 BPF JIT came down from 80 bytes to 64 bytes since [2].
>>>> Hope that clears the misunderstanding.
>>>
>>> I see. That makes sense. So it's similar to amd64 red zone,
>>> but there we have an issue with irqs, hence the kernel is
>>> compiled with -mno-red-zone.
>>
>> I assume that issue is that the interrupt entry unconditionally writes
>> some data below the stack pointer, disregarding the red zone?
>>
>>> I guess ppc always has a different interrupt stack and
>>> it's not an issue?
>>
>> No, the interrupt entry allocates a frame that is big enough to cover
>> the red zone as well as the space it needs to save registers.
>>
>> See STACK_INT_FRAME_SIZE which includes KERNEL_REDZONE_SIZE:
>>
>>    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/include/asm/ptrace.h?commit=8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b#n165
>>
>> Which is renamed to INT_FRAME_SIZE in asm-offsets.c and then is used in
>> the interrupt entry here:
>>
>>    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/kernel/exceptions-64s.S?commit=8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b#n497
> 
> Thanks for clarifying that, Michael.
> Only async interrupt handlers use different interrupt stacks, right?

... and a separate emergency stack for some special cases...

Thanks
Hari
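
For reference, the red zone arithmetic discussed in this thread can be
sketched in C as below. This is only an illustration, not code from the
patch: the 288, 80 and 64 byte figures come from the thread, while the
macro names, the helper functions and the 32-byte minimum frame size are
assumptions made for the example (the real STACK_FRAME_MIN_SIZE depends
on the ABI).

```c
#include <assert.h>

/* Figures quoted in the thread. */
#define REDZONE_SIZE      288  /* ppc64 ABI red zone below the stack pointer */
#define BPF_REDZONE_OLD    80  /* ppc64 BPF JIT red zone usage before [2] */
#define BPF_REDZONE_NEW    64  /* ppc64 BPF JIT red zone usage since [2] */

/* Assumption for illustration only; the real value is ABI-dependent. */
#define EXAMPLE_STACK_FRAME_MIN_SIZE 32

/* Red zone bytes left unused when the BPF JIT uses 'used' bytes of it. */
int unused_redzone(int used)
{
	return REDZONE_SIZE - used;
}

/* Mirrors 'bpf_dummy_frame_size = STACK_FRAME_MIN_SIZE + 64' from the patch. */
int dummy_frame_size(int min_frame_size)
{
	return min_frame_size + BPF_REDZONE_NEW;
}

/* Compile-time checks of the numbers mentioned in the thread. */
_Static_assert(REDZONE_SIZE - BPF_REDZONE_OLD == 208, "old comment: 208 unused");
_Static_assert(REDZONE_SIZE - BPF_REDZONE_NEW == 224, "corrected: 224 unused");
```

With 64 bytes in use, unused_redzone() gives the corrected 224 bytes;
with the older 80 bytes it gives the 208 bytes that the comment still
mentioned.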

