[PATCH v3 1/6] powerpc/code-patching: Implement generic text patching function

Benjamin Gray bgray at linux.ibm.com
Fri Oct 7 08:53:53 AEDT 2022


On Thu, 2022-10-06 at 09:19 +0000, Christophe Leroy wrote:
> 
> 
> Le 06/10/2022 à 05:36, Benjamin Gray a écrit :
> > On Wed, 2022-10-05 at 17:55 +0000, Christophe Leroy wrote:
> > > I'm on business trip this week so I can't test it on hardware,
> > > but
> > > the
> > > generated code looks horrid and sub-optimal, with a stack frame
> > > and
> > > so
> > > many registers saved into it. That's mpc885_ads_defconfig built
> > > with
> > > GCC
> > > 12, without modules without stackprotector with 4k pages.
> > 
> > Yeah, that definitely wasn't supposed to happen. I was looking at
> > the
> > 32 and 64 bit machine code closely while actively writing it, but I
> > must have refactored it badly when cleaning up for submission. It
> > was
> > supposed to automatically be inlined, leaving it identical to the
> > original on 32-bit.
> > 
> > Given inlining is desirable here, and 64-bit inlines anyway, I
> > think I
> > should just mark __patch_memory with __always_inline.
> 
> FWIW, I get a far better result with :
> 
> diff --git a/arch/powerpc/lib/code-patching.c 
> b/arch/powerpc/lib/code-patching.c
> index ba00c550d9d2..447b8de6e427 100644
> --- a/arch/powerpc/lib/code-patching.c
> +++ b/arch/powerpc/lib/code-patching.c
> @@ -21,7 +21,7 @@ static int __patch_memory(void *patch_addr,
> unsigned 
> long val, void *exec_addr,
>         /* Prefixed instruction may cross cacheline if cacheline
> smaller than 
> 64 bytes */
>         BUILD_BUG_ON(IS_ENABLED(CONFIG_PPC64) && L1_CACHE_BYTES <
> 64);
> 
> -       if (unlikely(is_dword))
> +       if (IS_ENABLED(CONFIG_PPC64) && unlikely(is_dword))
>                 __put_kernel_nofault(patch_addr, &val, u64, failed);
>         else
>                 __put_kernel_nofault(patch_addr, &val, u32, failed);

That's very interesting, as that's what I had originally. I removed the
IS_ENABLED(CONFIG_PPC64) part of the if condition as part of submission
cleanup refactoring because it should be redundant after constant
propagation. The weird thing is that GCC constant propagates either
way, it's just changing it's mind about inlining.

I can add it back, but the optimisation here seems very fragile. It
feels more reliable just to specify it explicitly.


More information about the Linuxppc-dev mailing list