powerpc-part: was: Re: [PATCH v6] livepatch: Clear relocation targets on a module removal

Joe Lawrence joe.lawrence at redhat.com
Wed Dec 14 09:19:22 AEDT 2022


On 12/13/22 8:29 AM, Petr Mladek wrote:
> On Tue 2022-12-13 00:13:46, Song Liu wrote:
>> On Mon, Dec 12, 2022 at 9:12 AM Petr Mladek <pmladek at suse.com> wrote:
>>>
>>> On Fri 2022-12-09 11:59:35, Song Liu wrote:
>>>> On Fri, Dec 9, 2022 at 3:41 AM Petr Mladek <pmladek at suse.com> wrote:
>>>>> On Mon 2022-11-28 17:57:06, Song Liu wrote:
>>>>>> On Fri, Nov 18, 2022 at 8:24 AM Petr Mladek <pmladek at suse.com> wrote:
>>>>>>>
>>>>>>>> --- a/arch/powerpc/kernel/module_64.c
>>>>>>>> +++ b/arch/powerpc/kernel/module_64.c
>>>>>>>> +#ifdef CONFIG_LIVEPATCH
>>>>>>>> +void clear_relocate_add(Elf64_Shdr *sechdrs,
>>>>>>>> +                    const char *strtab,
>>>>>>>> +                    unsigned int symindex,
>>>>>>>> +                    unsigned int relsec,
>>>>>>>> +                    struct module *me)
>>>>>>>> +{
>>>
>>> [...]
>>>
>>>>>>>> +
>>>>>>>> +             instruction = (u32 *)location;
>>>>>>>> +             if (is_mprofile_ftrace_call(symname))
>>>>>>>> +                     continue;
>>>>>
>>>>> Why do we ignore these symbols?
>>>>>
>>>>> I can't find any counter-part in apply_relocate_add(). It looks super
>>>>> tricky. It would deserve a comment.
>>>>>
>>>>> And I have no idea how we could maintain these exceptions.
>>>>>
>>>>>>>> +             if (!instr_is_relative_link_branch(ppc_inst(*instruction)))
>>>>>>>> +                     continue;
>>>>>
>>>>> Same here. It looks super tricky and there is no explanation.
>>>>
>>>> The two checks are from restore_r2(). But I cannot really remember
>>>> why we needed them. It is probably an updated version from an earlier
>>>> version (3 years earlier..).
>>>
>>> This is a good sign that it has to be explained in a comment.
>>> Or even better, it should not be copy-pasted.
>>>
>>>>>>>> +             instruction += 1;
>>>>>>>> +             patch_instruction(instruction, ppc_inst(PPC_RAW_NOP()));
>>>
>>> I believe that this is not enough. apply_relocate_add() does this:
>>>
>>> int apply_relocate_add(Elf64_Shdr *sechdrs,
>>> [...]
>>>                        struct module *me)
>>> {
>>> [...]
>>>                 case R_PPC_REL24:
>>>                         /* FIXME: Handle weak symbols here --RR */
>>>                         if (sym->st_shndx == SHN_UNDEF ||
>>>                             sym->st_shndx == SHN_LIVEPATCH) {
>>> [...]
>>>                         if (!restore_r2(strtab + sym->st_name,
>>>                                                         (u32 *)location + 1, me))
>>> [...]                                   return -ENOEXEC;
>>>
>>> --->                    if (patch_instruction((u32 *)location, ppc_inst(value)))
>>>                                 return -EFAULT;
>>>
>>> , where restore_r2() does:
>>>
>>> static int restore_r2(const char *name, u32 *instruction, struct module *me)
>>> {
>>> [...]
>>>         /* ld r2,R2_STACK_OFFSET(r1) */
>>> --->    if (patch_instruction(instruction, ppc_inst(PPC_INST_LD_TOC)))
>>>                 return 0;
>>> [...]
>>> }
>>>
>>> In other words, apply_relocate_add() modifies two instructions:
>>>
>>>    + patch_instruction() called in restore_r2() writes into "location + 1"
>>>    + patch_instruction() called in apply_relocate_add() writes into "location"
>>>
>>> IMHO, we have to clear both.
>>>
>>> IMHO, we need to implement a function that reverts the changes done
>>> in restore_r2(). Also we need to revert the changes done in
>>> apply_relocate_add().
>>
>> I finally got time to read all the details again and recalled what
>> happened with the code.
>>
>> The failure happens when we
>> 1) call apply_relocate_add() on klp load (or module first load,
>>    if klp was loaded first);
>> 2) do nothing when the module is unloaded;
>> 3) call apply_relocate_add() on module reload, which failed.
>>
>> The failure happens at this check in restore_r2():
>>
>>         if (*instruction != PPC_RAW_NOP()) {
>>                 pr_err("%s: Expected nop after call, got %08x at %pS\n",
>>                         me->name, *instruction, instruction);
>>                 return 0;
>>         }
>>
>> Therefore, apply_relocate_add only fails when "location + 1"
>> is not NOP. And to make it not fail, we only need to write NOP to
>> "location + 1" in clear_relocate_add().
> 
> Yes, this should be enough to pass the existing check.
> 
>> IIUC, you want clear_relocate_add() to undo everything we did
>> in apply_relocate_add(); while I was writing clear_relocate_add()
>> to make the next apply_relocate_add() not fail.
>>
>> I agree that, based on the name, clear_relocate_add() should
>> undo everything done by apply_relocate_add(). But I am not sure how
>> to handle some cases. For example, how do we undo
>>
>>                 case R_PPC64_ADDR32:
>>                         /* Simply set it */
>>                         *(u32 *)location = value;
>>                         break;
>>
>> Shall we just write zeros? I don't think this matters.
> 
> I guess that it would be zeros as we do in x86_64.
> 
> 
>> I think this is the question we should answer first:
>> What shall clear_relocate_add() do?
>> 1) undo everything done by apply_relocate_add();
>> 2) only do things needed to make the next
>>    apply_relocate_add succeed;
>> 3) something between 1) and 2).
> 
> Good question.
> 
> Hmm, the commit a443bf6e8a7674b86221f49 ("powerpc/modules: Add REL24
> relocation support of livepatch symbols") suggests that all symbols
> in the section SHN_LIVEPATCH have the type R_PPC_REL24. AFAIK, the
> kernel livepatches are the only user of the clear_relocate_add()
> feature.
> 
> If the above is correct then it might be enough to clear only
> R_PPC_REL24 type. And it might be enough to warn when clear_relocate_add()
> is called for another type so that we know when the relocations
> were not cleared properly.
> 
> Good question.  We might need some input from people familiar
> with the architecture and creating the livepatches.
> 
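For anyone skimming the thread, here is a minimal userspace sketch of
the load / unload / reload sequence Song describes above. It is not
kernel code: every sim_* name is hypothetical, the two-word "text"
array stands in for "location" and "location + 1", and the LD_TOC
encoding assumes the ELFv2 ABI (ld r2,24(r1)). It only models option 2)
for R_PPC_REL24, i.e. writing a nop back to "location + 1" so that the
check quoted from restore_r2() passes on the next apply.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define SIM_NOP    0x60000000u /* ori r0,r0,0 */
#define SIM_LD_TOC 0xe8410018u /* ld r2,24(r1); ELFv2 offset assumed */
#define SIM_BL     0x48000001u /* bl with zero offset, placeholder branch */

/* Mirrors the quoted restore_r2() check: the slot after the call must be a nop. */
static int sim_restore_r2(uint32_t *instruction)
{
	if (*instruction != SIM_NOP) {
		fprintf(stderr, "Expected nop after call, got %08x\n",
			(unsigned int)*instruction);
		return 0;
	}
	*instruction = SIM_LD_TOC;
	return 1;
}

/* Stand-in for the R_PPC_REL24 case: writes "location + 1", then "location". */
static int sim_apply_rel24(uint32_t *location, uint32_t value)
{
	if (!sim_restore_r2(location + 1))
		return -ENOEXEC;
	location[0] = value;
	return 0;
}

/* Option 2) only: make the next apply succeed, leave "location" untouched. */
static void sim_clear_rel24(uint32_t *location)
{
	location[1] = SIM_NOP;
}

/* Apply, optionally clear on "unload", then apply again on "reload". */
int sim_reload(int clear_on_unload)
{
	uint32_t text[2] = { SIM_BL, SIM_NOP };
	int rc = sim_apply_rel24(text, SIM_BL | 0x100);

	assert(rc == 0); /* the first apply always sees a nop */
	if (clear_on_unload)
		sim_clear_rel24(text);
	return sim_apply_rel24(text, SIM_BL | 0x100);
}
```

Calling sim_reload(0) models step 2) "do nothing" and returns -ENOEXEC
on the second apply; sim_reload(1) returns 0. Whether the branch at
"location" should also be reverted is exactly the open question above,
and the sketch deliberately does not answer it.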

Adding Russell to the CC list as he worked on some of the recent ppc64le
livepatch klp-relocation threads [1] [2].

Maybe it would be simpler to first organize a cleanup of the code, then add
the capability to undo the relocations?  According to [2] and the last
comment on [3], it sounded like the Power folks had a "full"(er)
solution in mind depending on our requirements.

Finally, I'll try to finish my v6.1 rebase of the klp-convert patchset
this week.  That includes a bunch of kselftests that generate all manner
of klp-relocation types and sections.  (More than I've ever seen out of
kpatch-build.)

[1] https://lore.kernel.org/linuxppc-dev/YX9UUBeudSUuJs01@redhat.com/
[2] https://lore.kernel.org/linuxppc-dev/YxAc87dTmclHGCUy@redhat.com/
[3] https://github.com/linuxppc/issues/issues/375

-- 
Joe


