[RFC PATCH 0/3] enable bpf_prog_pack allocator for powerpc

Christophe Leroy christophe.leroy at csgroup.eu
Thu Nov 17 17:59:09 AEDT 2022



On 16/11/2022 at 18:01, Hari Bathini wrote:
> 
> 
> On 16/11/22 12:14 am, Christophe Leroy wrote:
>>
>>
>> On 14/11/2022 at 18:27, Christophe Leroy wrote:
>>>
>>>
>>> On 14/11/2022 at 15:47, Hari Bathini wrote:
>>>> Hi Christophe,
>>>>
>>>> On 11/11/22 4:55 pm, Christophe Leroy wrote:
>>>>> On 10/11/2022 at 19:43, Hari Bathini wrote:
>>>>>> Most BPF programs are small, but they consume a page each. For systems
>>>>>> with busy traffic and many BPF programs, this may also add significant
>>>>>> pressure on instruction TLB. High iTLB pressure usually slows down the
>>>>>> whole system causing visible performance degradation for production
>>>>>> workloads.
>>>>>>
>>>>>> bpf_prog_pack, a customized allocator that packs multiple bpf programs
>>>>>> into preallocated memory chunks, was proposed [1] to address it. This
>>>>>> series extends this support to powerpc.
>>>>>>
>>>>>> Patches 1 & 2 add the arch specific functions needed to support this
>>>>>> feature. Patch 3 enables the support for powerpc. The last patch
>>>>>> ensures cleanup is handled gracefully.
>>>>>>
>>>>
>>>>>> Tested the changes successfully on a PowerVM. patch_instruction(),
>>>>>> needed for bpf_arch_text_copy(), is failing for ppc32. Debugging it.
>>>>>> Posting the patches in the meanwhile for feedback on these changes.
>>>>>
>>>>> I did a quick test on ppc32 and I don't get such a problem, only
>>>>> something wrong in the dump print, as only trap instructions are
>>>>> dumped, but tcpdump works as expected:
>>>>
>>>> Thanks for the quick test. Could you please share the config you used?
>>>> I am probably missing a few knobs in my config...
>>>>
>>>
>>
>> I also managed to test it on QEMU. The config is based on 
>> pmac32_defconfig.
> 
> I had the same config but hit this problem:
> 
>      # echo 1 > /proc/sys/net/core/bpf_jit_enable; modprobe test_bpf
>      test_bpf: #0 TAX
>      ------------[ cut here ]------------
>      WARNING: CPU: 0 PID: 96 at arch/powerpc/net/bpf_jit_comp.c:367 bpf_int_jit_compile+0x8a0/0x9f8

I get no such problem on QEMU, and I checked that the .config has:
CONFIG_STRICT_KERNEL_RWX=y
CONFIG_STRICT_MODULE_RWX=y

Boot successful.
/ # ifconfig eth0 10.0.2.15
e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
/ # tftp -g 10.0.2.2 -r test_bpf.ko
/ # echo 1 > /proc/sys/net/core/bpf_jit_enable
/ # insmod ./test_bpf.ko
test_bpf: #0 TAX jited:1 216 87 86 PASS
test_bpf: #1 TXA jited:1 57 27 27 PASS
test_bpf: #2 ADD_SUB_MUL_K jited:1 50 PASS
test_bpf: #3 DIV_MOD_KX jited:1 110 PASS
test_bpf: #4 AND_OR_LSH_K jited:1 67 26 PASS
test_bpf: #5 LD_IMM_0 jited:1 77 PASS
...

By the way, note that during boot you get:

	This platform has HASH MMU, STRICT_MODULE_RWX won't work

See why in 0670010f3b10 ("powerpc/32s: Enable STRICT_MODULE_RWX for the 
603 core")

Nevertheless, that should not prevent patch_instruction() from working.

Could you add a pr_err() in the failure path of __patch_instruction() to
print and check exec_addr and patch_addr?



>      jited:1
>      kernel tried to execute exec-protected page (be857020) - exploit attempt? (uid: 0)
>      BUG: Unable to handle kernel instruction fetch
>      Faulting instruction address: 0xbe857020

I'm a bit surprised by this. On hash-based book3s/32 there is no way to
exec-protect individual pages. Protection is performed at segment level:
all kernel segments have the NX bit set except the segment used for
module text, which is by default 0xb0000000-0xbfffffff.

Or maybe this is the first time that address is accessed, and the ISI
handler does the check before loading the hash table?
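
To make the segment arithmetic concrete, here is a minimal user-space
sketch (not kernel code). It relies on 32-bit effective addresses being
split into 16 segments of 256 MB selected by the top 4 bits, and it
assumes the default module text range quoted above. The names
segment_of(), in_module_text_segment() and check_faulting_address() are
hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The top 4 bits of a 32-bit effective address select one of
 * 16 segments, each covering 256 MB. */
#define SEGMENT_SHIFT 28

/* Assumption from the message above: module text defaults to
 * 0xb0000000-0xbfffffff, i.e. segment 0xb. */
#define MODULE_TEXT_SEGMENT 0xbu

/* Return the segment index (0x0..0xf) of an effective address. */
static unsigned int segment_of(uint32_t ea)
{
	return ea >> SEGMENT_SHIFT;
}

/* True if the address lies in the segment assumed to hold module text,
 * the only kernel segment without the NX bit in this model. */
static bool in_module_text_segment(uint32_t ea)
{
	return segment_of(ea) == MODULE_TEXT_SEGMENT;
}

/* Check the faulting address from the report against that range. */
void check_faulting_address(void)
{
	assert(segment_of(0xbe857020u) == 0xbu);
	assert(in_module_text_segment(0xbe857020u));
	assert(!in_module_text_segment(0xc0000000u));
}
```

Under these assumptions the faulting address 0xbe857020 falls inside
the module text segment, which is why a segment-level NX fault there
looks unexpected.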

> 
> bpf_jit_binary_pack_finalize() function failed due to patch_instruction() ..

Is there a way to tell the BPF core that the JIT failed in that case, to
avoid that?

Christophe

