[PATCH v2 0/6] Out-of-line static calls for powerpc64 ELF V2
Christophe Leroy
christophe.leroy at csgroup.eu
Tue Sep 27 00:19:51 AEST 2022
On 26/09/2022 at 08:43, Benjamin Gray wrote:
> Implementation of out-of-line static calls for PowerPC 64-bit ELF V2 ABI.
> Static calls patch an indirect branch into a direct branch at runtime.
> Out-of-line specifically has a caller directly call a trampoline, and
> the trampoline gets patched to directly call the target.
>
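To make the out-of-line scheme described above concrete, here is a rough userspace model, not the powerpc implementation. All names (tramp, tramp_target, tramp_update, add_one, add_two) are hypothetical. The real trampoline has its branch instruction rewritten so the call stays direct; this model uses a function pointer only to show which site gets retargeted (the trampoline, never the callers).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model only: a real static call rewrites a branch instruction. */
typedef int (*target_fn)(int);

static int add_one(int x) { return x + 1; }
static int add_two(int x) { return x + 2; }

/* Stands in for the branch target encoded inside the trampoline. */
static target_fn tramp_target = add_one;

/* Every caller calls the trampoline; call sites are never modified. */
static int tramp(int x)
{
	assert(tramp_target != NULL);
	return tramp_target(x);
}

/* Stands in for patching the trampoline to point at a new target. */
static void tramp_update(target_fn fn)
{
	tramp_target = fn;
}
```

Calling tramp(1), then tramp_update(add_two), then tramp(1) again goes through the same call site both times but reaches a different target the second time.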
> Previous version here:
> https://lore.kernel.org/all/20220916062330.430468-1-bgray@linux.ibm.com/
>
> I couldn't see a dedicated ftrace benchmark in the kernel, but my own
> benchmarking showed no significant impact to ftrace activation.
I use the following hack for benchmarking:
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 439e2ab6905e..e7d0d3deb8bf 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2628,10 +2628,11 @@ void __weak ftrace_replace_code(int mod_flags)
bool enable = mod_flags & FTRACE_MODIFY_ENABLE_FL;
int schedulable = mod_flags & FTRACE_MODIFY_MAY_SLEEP_FL;
int failed;
+ int t0;
if (unlikely(ftrace_disabled))
return;
-
+t0 = mftb();
do_for_each_ftrace_rec(pg, rec) {
if (rec->flags & FTRACE_FL_DISABLED)
@@ -2646,6 +2647,8 @@ void __weak ftrace_replace_code(int mod_flags)
if (schedulable)
cond_resched();
} while_for_each_ftrace_rec();
+t0 = mftb() - t0;
+pr_err("%s: %d\n", __func__, t0);
}
struct ftrace_rec_iter {
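The same measure-around-the-loop pattern can be tried outside the kernel. The sketch below is a userspace approximation under stated assumptions: clock_gettime(CLOCK_MONOTONIC) stands in for mftb(), and now_ns, time_call and bump are hypothetical names. It keeps the timestamp in a 64-bit unsigned type; the hack above stores it in an int, so its delta is only meaningful while the elapsed tick count fits in 31 bits.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Userspace stand-in for mftb(): monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: run fn(arg) once and return the elapsed nanoseconds. */
static uint64_t time_call(void (*fn)(void *), void *arg)
{
	uint64_t t0 = now_ns();

	fn(arg);
	return now_ns() - t0;
}

/* Trivial workload used only to exercise time_call(). */
static void bump(void *arg)
{
	++*(unsigned int *)arg;
}
```

For example, time_call(bump, &counter) runs the workload once and returns how long it took; in the kernel hack the workload is the do_for_each_ftrace_rec() loop and the result goes to pr_err().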
>
> The __patch_memory function is meant to be accessed through the size checking
> patch_memory wrapper. I don't think there's a way to expose the macro without
> also exposing __patch_memory though. I considered making the type an explicit
> macro param, but using the value type seemed more ergonomic.
>
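One way to picture the wrapper described above is a macro that derives the patch size from the type of the value and rejects unsupported sizes at build time. This is only a userspace sketch: raw_patch_sketch and patch_memory_sketch are hypothetical names, a plain memcpy stands in for the real text-poking and cache-flush sequence, and the actual patch_memory in the series may check different things.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical raw helper; callers are expected to go through the macro. */
static inline void raw_patch_sketch(void *dest, uint64_t val, size_t size)
{
	uint8_t v8 = (uint8_t)val;
	uint16_t v16 = (uint16_t)val;
	uint32_t v32 = (uint32_t)val;

	/* Store exactly 'size' bytes, independent of host endianness. */
	switch (size) {
	case 1: memcpy(dest, &v8, 1); break;
	case 2: memcpy(dest, &v16, 2); break;
	case 4: memcpy(dest, &v32, 4); break;
	case 8: memcpy(dest, &val, 8); break;
	default: assert(!"unsupported patch size");
	}
}

/* The size comes from the value's type and is checked when compiling. */
#define patch_memory_sketch(dest, val)                                   \
	do {                                                             \
		_Static_assert(sizeof(val) == 1 || sizeof(val) == 2 ||   \
			       sizeof(val) == 4 || sizeof(val) == 8,     \
			       "value must be 1, 2, 4 or 8 bytes");      \
		raw_patch_sketch((dest), (uint64_t)(val), sizeof(val));  \
	} while (0)
```

With a uint32_t insn holding 0x60000000 (the powerpc nop encoding), patch_memory_sketch(&buf[0], insn) writes four bytes and leaves the neighbouring word alone, while passing a struct or a 3-byte array fails to compile. Taking the size from the value type is what makes the call site shorter than an explicit type parameter would.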
> V2:
> Mostly accounting for feedback from Christophe:
> * Code patching rewritten
> - Rename to *_memory
> - Use __always_inline to get the compiler to realise it can
> collapse all the sub-functions
> - Pass data directly instead of through a pointer, eliding a redundant load
> - Flush the last byte of data too (technically redundant if an instruction, but
> saves a conditional branch + the isync will be the bottleneck).
> - Handle a non-coherent icache, assume a coherent dcache
> - Handle the case where we don't assume a 64-byte icache on 64-bit
> - Flatten the poke address init and teardown
> - Check the data size in patch_memory at build time
> (inline function was suggested, but a macro makes checking
> based on the data type easier).
> - It builds now on 32 bit and without strict RWX
> * Static call enabling is no longer configurable
> * Refactored arch_static_call_transform to minimise casting
> * Made the KUnit tests more robust (previously they changed non-volatile
> registers in the init hook, but that's incorrect because it returns to
> the KUnit framework before the test case is called).
> * Some other minor refactoring in other patches
>
>
> Benjamin Gray (6):
> powerpc/code-patching: Implement generic text patching function
> powerpc/module: Handle caller-saved TOC in module linker
> powerpc/module: Optimise nearby branches in ELF V2 ABI stub
> static_call: Move static call selftest to static_call_selftest.c
> powerpc/64: Add support for out-of-line static calls
> powerpc/64: Add tests for out-of-line static calls
>
> arch/powerpc/Kconfig | 12 +-
> arch/powerpc/include/asm/code-patching.h | 8 +
> arch/powerpc/include/asm/static_call.h | 80 +++++++-
> arch/powerpc/kernel/Makefile | 4 +-
> arch/powerpc/kernel/module_64.c | 27 ++-
> arch/powerpc/kernel/static_call.c | 151 +++++++++++++-
> arch/powerpc/kernel/static_call_test.c | 251 +++++++++++++++++++++++
> arch/powerpc/kernel/static_call_test.h | 56 +++++
> arch/powerpc/lib/code-patching.c | 90 +++++---
> kernel/Makefile | 1 +
> kernel/static_call_inline.c | 43 ----
> kernel/static_call_selftest.c | 41 ++++
> 12 files changed, 682 insertions(+), 82 deletions(-)
> create mode 100644 arch/powerpc/kernel/static_call_test.c
> create mode 100644 arch/powerpc/kernel/static_call_test.h
> create mode 100644 kernel/static_call_selftest.c
>
>
> base-commit: 3d7a198cfdb47405cfb4a3ea523876569fe341e6
> --
> 2.37.3