[PATCH v2 1/6] powerpc/code-patching: Implement generic text patching function
Christophe Leroy
christophe.leroy at csgroup.eu
Tue Sep 27 15:54:57 AEST 2022
On 27/09/2022 at 04:57, Benjamin Gray wrote:
> On Mon, 2022-09-26 at 14:33 +0000, Christophe Leroy wrote:
>>> +#define patch_memory(addr, val) \
>>> +({ \
>>> + BUILD_BUG_ON(!__native_word(val)); \
>>> + __patch_memory(addr, (unsigned long) val, sizeof(val)); \
>>> +})
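Review bot: the macro arguments are not parenthesised, so `(unsigned long) val` applied to an expression argument such as `a + b` casts only `a`; write `__patch_memory((addr), (unsigned long)(val), sizeof(val))` so the macro behaves like a function call for any argument expression. Note also that `sizeof(val)` makes the write width follow the value's type rather than the destination, so patching an 8-byte slot with an `int` constant writes only 4 bytes; put_user() sizes from the destination pointer for that reason.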
>>
>> Can you do a static __always_inline function instead of a macro here?
>
> I didn't before because it doesn't allow using the type as a parameter.
> I considered these forms
>
> patch_memory(addr, val, 8);
> patch_memory(addr, val, void*);
> patch_memory(addr, val); // size taken from val type
>
> And thought the third was the nicest to use. Though coming back to
> this, I hadn't considered
>
> patch_memory(addr, val, sizeof(void*))
>
> which would still allow a type to decide the size, and not be a macro.
> I've got an example implementation further down that also addresses the
> size check issue.
Oh, I missed that you did automatic type sizing. Fair enough.
However I think taking the type of the passed value is dangerous.
See put_user(), it uses the size of the destination pointer, not the
size of the input value.
patch_memory doesn't seem to be used outside of code-patching.c, so I
don't think it is worth worrying about a nice-looking API. Just make it
simple and pass the size to the function.
>
>>> +static int __always_inline ___patch_memory(void *patch_addr,
>>> + unsigned long data,
>>> + void *prog_addr,
>>> + size_t size)
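Review bot: checkpatch flags the `static int __always_inline` ordering on this declaration (INLINE_LOCATION: the inline keyword should sit between the storage class and the type); write `static __always_inline int ___patch_memory(void *patch_addr, unsigned long data, void *prog_addr, size_t size)`. Separately from whether __always_inline is wanted at all, the triple-underscore name is one underscore deeper than the rest of the file's `__do_patch_*` / `__patch_*` helpers; a descriptive name (e.g. one saying it is the raw store plus flush step) would be easier to follow than counting underscores.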
>>
>> Is it really needed in the .c file? I would expect GCC to take the
>> right decision by itself.
>
> I thought it'd be better to always inline it given it's only used
> generically in do_patch_memory and __do_patch_memory, which both get
> inlined into __patch_memory. But it does end up generating two copies
> due to the different contexts it's called in, so probably not worth it.
> Removed for v3.
>
> (raw_patch_instruction gets an optimised inline of ___patch_memory
> either way)
>
>> A BUILD_BUG() would be better here I think.
>
> BUILD_BUG() as the default case always triggers though, I assume
> because the constant used for size is too far away. How about
>
> static __always_inline int patch_memory(void *addr,
> unsigned long val,
> size_t size)
> {
> int __patch_memory(void *dest, unsigned long src, size_t size);
>
> BUILD_BUG_ON_MSG(!(size == sizeof(char) ||
> size == sizeof(short) ||
> size == sizeof(int) ||
> size == sizeof(long)),
> "Unsupported size for patch_memory");
> return __patch_memory(addr, val, size);
> }
>
> Declaring the __patch_memory function inside of patch_memory enforces
> that you can't accidentally call __patch_memory without going through
> this or the *patch_instruction entry points (which hardcode the size).
Aren't you making it more difficult than needed? That's C, not C++, and
we are not trying to help the user.
All kernel developers know that as soon as they use a function that has
a leading double underscore they will be on their own.
And again, patch_memory() isn't used anywhere else, at least for the
time being, so why worry about that?
>
>>> + }
>>>
>>> - __put_kernel_nofault(patch_addr, &val, u32,
>>> failed);
>>> - } else {
>>> - u64 val = ppc_inst_as_ulong(instr);
>>> + dcbst(patch_addr);
>>> + dcbst(patch_addr + size - 1); /* Last byte of data may
>>> cross a cacheline */
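Review bot: the comment on the second `dcbst()` is misleading; a byte cannot cross a cacheline, the point is that `patch_addr + size - 1` may lie in a different line than `patch_addr`. Since `size` is at most a native word and therefore smaller than any cacheline, flushing the line of the first byte and the line of the last byte covers every line the store touched, so the pair is sufficient; say so in one comment, e.g. `/* data may cross a cacheline boundary */` above the two calls, rather than attaching the reason to the last byte only.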
>>
>> Or the second byte of data may cross a cacheline ...
>
> It might, but unless we are assuming data cachelines smaller than the
> native word size it will either be in the first or last byte's
> cacheline. Whereas the last byte might be in its own cacheline.
>
> As justification the comment's misleading though, how about reducing it
> to "data may cross a cacheline" and leaving the reason for flushing the
> last byte implicit?
Yes that was my worry, a misleading comment.
I think "data may cross a cacheline" is what we need as a comment.
>
>>> -static int __do_patch_instruction(u32 *addr, ppc_inst_t instr)
>>> +static int __always_inline __do_patch_memory(void *dest, unsigned
>>> long src, size_t size)
>>> {
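Review bot: `unsigned long src` is an immediate value while `void *dest` is a pointer, and the same value is called `data` in `___patch_memory()` earlier in the patch; use one name (`data` or `val`) in both signatures so `src` is not read as a source pointer. The `static int __always_inline` ordering here also has the same checkpatch INLINE_LOCATION issue as `___patch_memory()`: `static __always_inline int __do_patch_memory(...)`.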
>>
>> Whoa, do we really want all this to be __always_inline? Did you check
>> the text size increase?
>
> These ones are redundant because GCC will already inline them, they
> were just part of experimenting inlining ___patch_memory. Will remove
> for v3.
>
> The text size doesn't increase though because the call hierarchy is
> just a linear chain of
> __patch_memory -> do_patch_memory -> __do_patch_memory
Yes, I had in mind that all those would be inlined down to all callers
of patch_instruction() and patch_memory(), but of course it stays in
code-patching.c so that's not a problem.
>
> The entry point __patch_memory is not inlined.
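The two shapes of the size check argued over above (size derived from the value's type via `sizeof(val)` in the v2 macro, versus an explicit `size` parameter checked with `BUILD_BUG_ON_MSG` in Benjamin's later form) can be compared in user space. The following is an added example written for this thread, not code from the patch or from arch/powerpc: `BUILD_BUG_ON_MSG`, `__native_word` and `__always_inline` are stand-ins that mimic the kernel's definitions, `__patch_memory` is a plain `memcpy()` worker assumed here in place of the real powerpc one, and `patch_memory_by_type` is a name chosen for the v2 macro so both front ends can coexist in one file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Added example (not the patch's code): user-space stand-ins for the kernel
 * helpers discussed in the thread. __patch_memory() below is an assumed
 * memcpy()-based worker, not the powerpc implementation.
 */
#define __always_inline inline __attribute__((always_inline))

/*
 * Stand-in for the kernel's BUILD_BUG_ON_MSG(): like the kernel's version it is
 * a call to a function carrying __attribute__((error)), so the build fails only
 * if the optimiser cannot prove the condition false after inlining and constant
 * propagation. Unoptimised builds skip the check, as compiletime_assert() does.
 */
#if defined(__OPTIMIZE__) && defined(__GNUC__)
extern void __patch_memory_bad_size(void)
	__attribute__((error("Unsupported size for patch_memory")));
#define BUILD_BUG_ON_MSG(cond, msg) \
	do { if (cond) __patch_memory_bad_size(); } while (0)
#else
#define BUILD_BUG_ON_MSG(cond, msg) do { } while (0)
#endif

/* Stand-in for the kernel's __native_word(): true for the four native widths. */
#define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || \
			  sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))

/* Out-of-line worker: stores the low `size` bytes of `val` at `dest`, host byte order. */
static int __patch_memory(void *dest, unsigned long val, size_t size)
{
	switch (size) {
	case 1: { uint8_t  v = (uint8_t)val;  memcpy(dest, &v, 1); break; }
	case 2: { uint16_t v = (uint16_t)val; memcpy(dest, &v, 2); break; }
	case 4: { uint32_t v = (uint32_t)val; memcpy(dest, &v, 4); break; }
	case 8: { uint64_t v = (uint64_t)val; memcpy(dest, &v, 8); break; }
	default:
		return -1;	/* the kernel worker would return -EINVAL here */
	}
	return 0;
}

/* v2 form: the write width is sizeof(val), i.e. it follows the value's type. */
#define patch_memory_by_type(addr, val) \
({ \
	BUILD_BUG_ON_MSG(!__native_word(val), "patch_memory: value is not a native word"); \
	__patch_memory((addr), (unsigned long)(val), sizeof(val)); \
})

/*
 * Explicit-size form: the check is only discharged when `size` is a constant
 * in the inlined caller; a non-constant size would fail to build at -O2.
 */
static __always_inline int patch_memory(void *addr, unsigned long val, size_t size)
{
	BUILD_BUG_ON_MSG(!(size == sizeof(char) || size == sizeof(short) ||
			   size == sizeof(int) || size == sizeof(long)),
			 "Unsupported size for patch_memory");
	return __patch_memory(addr, val, size);
}

/*
 * The hazard with the value-typed form: patching an 8-byte slot with an `int`
 * constant writes 4 bytes and leaves half of the stale contents in place
 * (which half depends on endianness).
 */
static uint64_t patch_u64_slot_by_type(void)
{
	uint64_t slot = 0xffffffffffffffffull;	/* constructed stale contents */

	patch_memory_by_type(&slot, 0x12345678);	/* int literal: 4-byte write */
	return slot;
}

/* Sizing from the destination, as put_user() does, rewrites the whole slot. */
static uint64_t patch_u64_slot_by_dest(void)
{
	uint64_t slot = 0xffffffffffffffffull;	/* constructed stale contents */

	patch_memory(&slot, 0x12345678, sizeof(slot));
	return slot;
}

/* The same entry point narrows a wide value to a 1-byte destination. */
static unsigned int patch_u8_slot(void)
{
	uint8_t slot = 0;

	patch_memory(&slot, 0x1234abUL, sizeof(slot));
	return slot;
}

/* Calling the double-underscore worker directly bypasses the build-time check. */
static int patch_with_bad_size(void)
{
	uint64_t slot = 0;

	return __patch_memory(&slot, 0x1234, 3);	/* rejected only at run time */
}
```

Build it with `-O2` to see the `__attribute__((error))` check take effect: adding a call such as `patch_memory(p, v, n)` with a run-time `n` then fails to compile, which is the behaviour the thread wants from `BUILD_BUG_ON_MSG`, while `__patch_memory(p, v, 3)` compiles and only fails at run time, which is the "you are on your own" contract of a double-underscore helper.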