[PATCH RFC] powerpc: Panic on jump label code patching failure
Christophe Leroy
christophe.leroy at csgroup.eu
Sat Sep 6 16:42:07 AEST 2025
Le 06/09/2025 à 05:52, Ritesh Harjani a écrit :
> [Vous ne recevez pas souvent de courriers de riteshh at linux.ibm.com. Découvrez pourquoi ceci est important à https://aka.ms/LearnAboutSenderIdentification ]
>
> Andrew Donnellan <ajd at linux.ibm.com> writes:
>
>> If patch_branch() or patch_instruction() fails while updating a jump
>> label, we presently fail silently, leading to unpredictable behaviour
>> later on.
>>
>> Change arch_jump_label_transform() to panic on a code patching failure,
>> matching the existing behaviour of arch_static_call_transform().
>>
>> Reported-by: Erhard Furtner <erhard_f at mailbox.org>
>> Signed-off-by: Andrew Donnellan <ajd at linux.ibm.com>
>>
>> ---
>>
>> Ran into this while debugging an issue that Erhard reported to me about my
>> PAGE_TABLE_CHECK series on a G4, where updating a static key failed
>> silently, but only for one call site, leading to an incorrect reference
>> count later on. This looks to be due to the issue fixed in [0]. A loud
>> failure would have saved us all considerable debugging time.
>>
>> Should I change the return type of arch_jump_label_transform() and handle
>> this in an arch-independent way? Are there other users of code patching
>> in powerpc that ought to be hardened?
>>
>> Or is this excessive?
>>
>> [0] https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4b5e6eb281d7b1ea77619bee17095f905a125168.1757003584.git.christophe.leroy@csgroup.eu/
>> ---
>> arch/powerpc/kernel/jump_label.c | 9 +++++++--
>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/jump_label.c b/arch/powerpc/kernel/jump_label.c
>> index 2659e1ac8604..80d41ed7ac50 100644
>> --- a/arch/powerpc/kernel/jump_label.c
>> +++ b/arch/powerpc/kernel/jump_label.c
>> @@ -12,9 +12,14 @@ void arch_jump_label_transform(struct jump_entry *entry,
>> enum jump_label_type type)
>> {
>> u32 *addr = (u32 *)jump_entry_code(entry);
>> + int err;
>>
>> if (type == JUMP_LABEL_JMP)
>> - patch_branch(addr, jump_entry_target(entry), 0);
>> + err = patch_branch(addr, jump_entry_target(entry), 0);
>> else
>> - patch_instruction(addr, ppc_inst(PPC_RAW_NOP()));
>> + err = patch_instruction(addr, ppc_inst(PPC_RAW_NOP()));
>> +
>> + if (err)
>> + panic("%s: patching failed, err %d, type %d, addr %pS, target %pS\n",
>> + __func__, err, type, addr, (void *)jump_entry_target(entry));
>> }
>
> arch_jump_label_transform() is mainly getting called from
> __jump_level_update() and it's used for enabling or updating static keys / branch.
>
> But static keys can also be used by drivers / module subsystem whose
> initialization happens late. Although I understand that if the above
> fails, it might fail much before, from the arch setup code itself, but
> panic() still feels like a big hammer.
But not being able to patch the kernel as required means that you get a
kernel behaving differently from what is expected.
Imagine a kernel running on a board that is controlling a saw. There is
a patch_instruction() to activate the safety feature which detects when
your hands are too close to the blade. Do you want the kernel to
continue running seamlessly when that patch_instruction() fails ? I'm
sure you don't !
>
> Would pr_err() print with WARN_ON_ONCE(1) would suffice in case of an
> err?
No, that's not enough, you can't rely on a kernel that will no behave as
expected.
>
> Also you said you ran into a problem at just one call site where above
> was silently failing. With the above change are you able to hit the
> panic() now? Because from what I see in patch_instruction(), it mainly
> will boil down to calling __patch_mem() which always returns 0.
As far as I can see, __patch_mem() returns -EPERM when
__put_kernel_nofault() fails:
static int __patch_mem(void *exec_addr, unsigned long val, void
*patch_addr, bool is_dword)
{
if (!IS_ENABLED(CONFIG_PPC64) || likely(!is_dword)) {
/* For big endian correctness: plain address would use the wrong half */
u32 val32 = val;
__put_kernel_nofault(patch_addr, &val32, u32, failed);
} else {
__put_kernel_nofault(patch_addr, &val, u64, failed);
}
asm ("dcbst 0, %0; sync; icbi 0,%1; sync; isync" :: "r" (patch_addr),
"r" (exec_addr));
return 0;
failed:
mb(); /* sync */
return -EPERM;
}
> Although there are other places where there can be an error returned,
> so I was wondering if that is what you were hitting or something else?
Andrew was hitting the -EPERM because the memory area was read-only.
Christophe
More information about the Linuxppc-dev
mailing list