[RFC PATCH 0/4] Out-of-line static calls for powerpc64 ELF V2

Christophe Leroy christophe.leroy at csgroup.eu
Thu Sep 1 18:07:06 AEST 2022


CCing static call maintainers/reviewers.

And note that my email address changed to 
christophe.leroy at csgroup.eu months ago.

On 01/09/2022 at 07:58, Benjamin Gray wrote:
> 
> WIP implementation of out-of-line static calls for PowerPC 64-bit ELF V2 ABI.
> Static calls patch an indirect branch into a direct branch at runtime.
> Out-of-line specifically has a caller directly call a trampoline, and
> the trampoline gets patched to directly call the target. This current
> implementation has a known issue described in detail below, and is
> presented here for any comments or suggestions.

For a wider audience, I recommend that you copy the people listed under 
STATIC BRANCH/CALL in the MAINTAINERS file.
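
For readers new to the topic, the out-of-line scheme described in the 
quoted cover letter can be modelled in user space roughly as below. This 
is only a sketch: every name is hypothetical, and a function pointer 
stands in for the branch instruction that the real implementation 
patches inside the trampoline, so the model still makes an indirect call 
where the kernel would end up with a direct one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
typedef int (*model_target_fn)(int);

static int model_add_one(int x)   { return x + 1; }
static int model_times_two(int x) { return x * 2; }

/* Stands in for the branch instruction inside the trampoline. */
static model_target_fn model_tramp_target = model_add_one;

/* Callers always call this one fixed symbol, never the target itself. */
static int model_trampoline(int x)
{
	return model_tramp_target(x);
}

/* Stands in for patching the trampoline's branch at runtime. */
static void model_static_call_update(model_target_fn fn)
{
	assert(fn != NULL);
	model_tramp_target = fn;
}
```

Callsites never change in this model; only the trampoline is retargeted, 
which is what distinguishes out-of-line from inline static calls.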


> 
> 64-bit ELF V2 specifies a table of contents (TOC) pointer stored in r2.
> Functions that use a TOC can use it to perform various operations
> relative to its value. When the caller and target use different TOCs,
> the static call implementation must ensure the TOC is kept consistent
> so that neither tries to use the other's TOC.
> 
> However, while the trampoline can change the caller's TOC to the target's
> TOC, it cannot restore the caller's TOC when the target returns. For the
> trampoline to do this would require the target to return to the trampoline,
> and so the return address back to the caller would need to be saved to
> the stack. But the trampoline cannot move the stack because the target
> may be expecting parameters relative to the stack pointer (when registers
> are insufficient or varargs are used). And as static calls are usable in
> generic code, there can be no arch-specific restrictions on parameters
> that would sidestep this issue.
> 
> Normally the TOC change issue is resolved by the caller, which will save
> and restore its TOC if necessary. For static calls though the caller
> sees the trampoline as a local function, so assumes it does not change
> the TOC and treats r2 as nonvolatile (no save & restore added).
> 
> This is a similar problem to that faced by livepatch. Static calls may have
> a few more options though, because the call is performed through a
> `static_call` macro, allowing annotation and insertion of inline assembly
> at every callsite.
> 
> I can think of several possible solutions, but they are relatively complex:
> 
> 1. Patching the callsites at runtime, as is done for inline static calls.
>      This also requires some inline assembly to save `r2` to the TOC pointer
>      Doubleword slot on the stack before each static call, as the caller may
>      not have done so in its prologue. It should be easy to add though, because
>      static calls are invoked through the `static_call` macro that can be
>      modified appropriately. The runtime patching then modifies the trailing
>      function call `nop` to restore this r2 value.

I'm working on implementing inline static calls for ppc32. I will copy 
you on the next spin (if I don't forget).
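
To illustrate the nop patching mentioned in option 1, here is a minimal 
user-space sketch that turns the nop after a call into "ld r2,24(r1)". 
It assumes the ELF V2 frame layout, where the TOC pointer doubleword is 
at offset 24 from r1. The helper names are hypothetical and it writes to 
a plain variable, not to kernel text.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_INST_NOP	0x60000000u	/* ori r0,r0,0 */
#define MODEL_R2_SLOT	24		/* ELF V2 TOC pointer doubleword, offset from r1 */

/* Encode "ld rt,ds(ra)": primary opcode 58, DS-form, low two bits zero. */
static uint32_t model_encode_ld(unsigned int rt, unsigned int ra, unsigned int ds)
{
	assert(rt < 32 && ra < 32);
	assert((ds & 3) == 0 && ds < 0x8000);	/* positive, 4-byte aligned offset */
	return (58u << 26) | (rt << 21) | (ra << 16) | ds;
}

/*
 * Replace the nop following a call with "ld r2,24(r1)".
 * Returns 0 on success, -1 if the slot does not hold a nop.
 */
static int model_patch_toc_restore(uint32_t *after_call)
{
	if (*after_call != MODEL_INST_NOP)
		return -1;
	*after_call = model_encode_ld(2, 1, MODEL_R2_SLOT);
	return 0;
}
```

The restore only helps if r2 was actually stored to that slot before the 
call, which is why option 1 also needs the inline assembly in the 
`static_call` macro.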

> 
>      The patching itself can probably be done at compile time at kernel callsites.
> 
> 2. Use the livepatch stack method of growing the base of the stack backwards.
>      I haven't looked too closely at the implementation though, especially
>      regarding how much room is available.
> 
>      The benefit of this method is that there can be zero overhead when the
>      trampoline and target share a TOC. So the trampoline in kernel-only
>      calls can still just be a single direct branch.
> 
> 3. Remove the local entry point from the trampoline. This makes the trampoline
>      less efficient, as it cannot assume r2 to be correct, but should at least
>      cause the caller to automatically save and restore r2 without manual patching.
>      From the ABI manual:
> 
>      > 2.2.1. Function Call Linkage Protocols
>      >   A function that uses a TOC pointer always has a separate local entry point
>      >   [...], and preserves r2 when called via its local entry point.
>      >
>      > 2.2.2.1. Register Roles
>      >   (a) Register r2 is nonvolatile with respect to calls between functions
>      >       in the same compilation unit, except under the conditions in footnote (b)
>      >   (b) Register r2 is volatile and available for use in a function that does not
>      >       use a TOC pointer and that does not guarantee that it preserves r2.
> 
>      So not having a local entry point implies not using a TOC pointer, which
>      implies r2 is volatile if the trampoline does not guarantee that it preserves
>      r2. However experimenting with such a trampoline showed the caller still did
>      not preserve its TOC when necessary, even when the trampoline used instructions
>      that wrote to r2. Possibly there's an attribute that can be used to mark the
>      necessary info, but I could not find one.
> 

Another possible solution (at least for the kernel) is to restore r2 from 
the PACA instead of from the stack, so it does not matter whether the 
caller saved it. Module code does something similar; see the comment 
before create_ftrace_stub().
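
As a rough user-space model of that idea (all names are hypothetical; a 
global variable stands in for r2 and a small struct for the PACA with its 
kernel TOC value), the sequence would look like this:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: not the kernel's paca_struct or real TOC values. */
struct model_paca {
	uint64_t kernel_toc;
};

#define MODEL_KERNEL_TOC	0x1000u
#define MODEL_MODULE_TOC	0x2000u

static struct model_paca model_local_paca = { .kernel_toc = MODEL_KERNEL_TOC };
static uint64_t model_r2 = MODEL_KERNEL_TOC;	/* stands in for register r2 */

/* A target living in a module: it expects r2 to hold the module's TOC. */
static int model_module_target(int x)
{
	assert(model_r2 == MODEL_MODULE_TOC);
	return x + 1;
}

/* Kernel caller going through the trampoline. */
static int model_call_via_trampoline(int x)
{
	int ret;

	model_r2 = MODEL_MODULE_TOC;		/* trampoline switches to the target's TOC */
	ret = model_module_target(x);
	model_r2 = model_local_paca.kernel_toc;	/* restore from PACA, no stack slot needed */
	return ret;
}
```

This only covers callers that use the kernel TOC. A caller inside a 
module has its own TOC, which the PACA does not record, hence "at least 
for kernel".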



> 
> Benjamin Gray (3):
>    static_call: Move static call selftest to static_call.c
>    powerpc/64: Add support for out-of-line static calls
>    powerpc/64: Add tests for out-of-line static calls
> 
> Russell Currey (1):
>    powerpc/code-patching: add patch_memory() for writing RO text
> 
>   arch/powerpc/Kconfig                     |  23 +-
>   arch/powerpc/include/asm/code-patching.h |   2 +
>   arch/powerpc/include/asm/static_call.h   |  45 +++-
>   arch/powerpc/kernel/Makefile             |   4 +-
>   arch/powerpc/kernel/static_call.c        | 184 +++++++++++++++-
>   arch/powerpc/kernel/static_call_test.c   | 257 +++++++++++++++++++++++
>   arch/powerpc/lib/code-patching.c         |  65 ++++++
>   kernel/static_call.c                     |  43 ++++
>   kernel/static_call_inline.c              |  43 ----
>   9 files changed, 613 insertions(+), 53 deletions(-)
>   create mode 100644 arch/powerpc/kernel/static_call_test.c
> 
> 
> base-commit: c5e4d5e99162ba8025d58a3af7ad103f155d2df7
> --
> 2.37.2

