[powerpc] ftrace warning kernel/trace/ftrace.c:2068 with code-patching selftests

Thu Jan 27 23:59:31 AEDT 2022

On Thu, Jan 27, 2022 at 01:22:17PM +0100, Ard Biesheuvel wrote:
> On Thu, 27 Jan 2022 at 13:20, Mark Rutland <mark.rutland at arm.com> wrote:
> > On Thu, Jan 27, 2022 at 01:03:34PM +0100, Ard Biesheuvel wrote:
> >
> > > These architectures use place-relative extables for the same reason:
> > > place relative references are resolved at build time rather than at
> > > runtime during relocation, making a build time sort feasible.
> > >
> > > arch/alpha/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/arm64/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/ia64/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/parisc/include/asm/uaccess.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/powerpc/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/riscv/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/s390/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > > arch/x86/include/asm/extable.h:#define ARCH_HAS_RELATIVE_EXTABLE
> > >
> > > Note that the swap routine becomes something like the below, given
> > > that the relative references need to be fixed up after the entry
> > > changes place in the sorted list.
> > >
> > > static void swap_ex(void *a, void *b, int size)
> > > {
> > >         struct exception_table_entry *x = a, *y = b, tmp;
> > >         int delta = b - a;
> > >
> > >         tmp = *x;
> > >         x->insn = y->insn + delta;
> > >         y->insn = tmp.insn - delta;
> > >         ...
> > > }
> > >
> > > As a bonus, the resulting footprint of the table in the image is
> > > reduced by 8x, given that every 8 byte pointer has an accompanying 24
> > > byte RELA record, so we go from 32 bytes to 4 bytes for every call to
> > > __gnu_mcount_mc.
> >
> > Absolutely -- it'd be great if we could do that for the callsite locations; the
> > difficulty is that the entries are generated by the compiler itself, so we'd
> > either need some build/link time processing to convert each absolute 64-bit
> > value to a relative 32-bit offset, or new compiler options to generate those as
> > relative offsets from the outset.
> 
> Don't we use scripts/recordmcount.pl for that?

Not quite -- we could adjust it to do so, but today it doesn't consider
existing mcount_loc entries, and just generates new ones where the compiler has
generated calls to mcount, which it finds by scanning the instructions in the
binary. Today it is not used:

* On arm64 when we default to using `-fpatchable-function-entry=N`.  That makes
  the compiler insert 2 NOPs in the function prologue, and log the location of
  that NOP sled to a section called.  `__patchable_function_entries`. 

  We need the compiler to do that since we can't reliably identify 2 NOPs in a
  function prologue as being intended to be a patch site, as e.g. there could
  be notrace functions where the compiler had to insert NOPs for alignment of a
  subsequent brnach or similar.

* On architectures with `-nop-mcount`. On these, it's necessary to use
  `-mrecord-mcount` to have the compiler log the patch-site, for the same
  reason as with `-fpatchable-function-entry`.

* On architectures with `-mrecord-mcount` generally, since this avoids the
  overhead of scanning each object.

* On x86 when objtool is used.

Thanks,
Mark.