[PATCH v2 1/6] powerpc/code-patching: Implement generic text patching function

Tue Sep 27 12:57:26 AEST 2022

On Mon, 2022-09-26 at 14:33 +0000, Christophe Leroy wrote:
> > +#define patch_memory(addr, val) \
> > +({ \
> > +       BUILD_BUG_ON(!__native_word(val)); \
> > +       __patch_memory(addr, (unsigned long) val, sizeof(val)); \
> > +})
> 
> Can you do a static __always_inline function instead of a macro here
> ?

I didn't before because it doesn't allow using the type as a parameter.
I considered these forms

  patch_memory(addr, val, 8);
  patch_memory(addr, val, void*);
  patch_memory(addr, val);  // size taken from val type

And thought the third was the nicest to use. Though coming back to
this, I hadn't considered

  patch_memory(addr, val, sizeof(void*))

which would still allow a type to decide the size, and not be a macro.
I've got an example implementation further down that also addresses the
size check issue.

> > +static int __always_inline ___patch_memory(void *patch_addr,
> > +                                          unsigned long data,
> > +                                          void *prog_addr,
> > +                                          size_t size)
> 
> Is it really needed in the .c file ? I would expect GCC to take the 
> right decision by itself.

I thought it'd be better to always inline it given it's only used
generically in do_patch_memory and __do_patch_memory, which both get
inlined into __patch_memory. But it does end up generating two copies
due to the different contexts it's called in, so probably not worth it.
Removed for v3.

(raw_patch_instruction gets an optimised inline of ___patch_memory
either way)

> A BUILD_BUG() would be better here I think.

BUILD_BUG() as the default case always triggers though, I assume
because the constant used for size is too far away. How about

  static __always_inline int patch_memory(void *addr, 
                                          unsigned long val, 
                                          size_t size) 
  {
      int __patch_memory(void *dest, unsigned long src, size_t size);

      BUILD_BUG_ON_MSG(!(size == sizeof(char)  ||
                         size == sizeof(short) ||
                         size == sizeof(int)   ||
                         size == sizeof(long)),
                       "Unsupported size for patch_memory");
      return __patch_memory(addr, val, size);
  }

Declaring the __patch_memory function inside of patch_memory enforces
that you can't accidentally call __patch_memory without going through
this or the *patch_instruction entry points (which hardcode the size).

> > +       }
> > 
> > -               __put_kernel_nofault(patch_addr, &val, u32,
> > failed);
> > -       } else {
> > -               u64 val = ppc_inst_as_ulong(instr);
> > +       dcbst(patch_addr);
> > +       dcbst(patch_addr + size - 1); /* Last byte of data may
> > cross a cacheline */
> 
> Or the second byte of data may cross a cacheline ...

It might, but unless we are assuming data cachelines smaller than the
native word size it will either be in the first or last byte's
cacheline. Whereas the last byte might be in it's own cacheline.

As justification the comment's misleading though, how about reducing it
to "data may cross a cacheline" and leaving the reason for flushing the
last byte implicit?

> > -static int __do_patch_instruction(u32 *addr, ppc_inst_t instr)
> > +static int __always_inline __do_patch_memory(void *dest, unsigned
> > long src, size_t size)
> >   {
> 
> Whaou, do we really want all this to be __always_inline ? Did you
> check 
> the text size increase ?

These ones are redundant because GCC will already inline them, they
were just part of experimenting inlining ___patch_memory. Will remove
for v3. 

The text size doesn't increase though because the call hierarchy is
just a linear chain of
__patch_memory -> do_patch_memory -> __do_patch_memory

The entry point __patch_memory is not inlined.