[PATCH] powerpc/64: allow compiler to cache 'current'
Michael Ellerman
patch-notifications at ellerman.id.au
Thu Aug 22 23:08:53 AEST 2019
On Wed, 2019-06-12 at 14:03:17 UTC, Nicholas Piggin wrote:
> current may be cached by the compiler, so remove the volatile asm
> restriction. This results in better generated code, as well as being
> smaller and fewer dependent loads, it can avoid store-hit-load flushes
> like this one that shows up in irq_exit():
>
> preempt_count_sub(HARDIRQ_OFFSET);
> if (!in_interrupt() && ...)
>
> Which ends up as:
>
> ((struct thread_info *)current)->preempt_count -= HARDIRQ_OFFSET;
> if (((struct thread_info *)current)->preempt_count ...
>
> Evaluating current twice presently means it has to be loaded twice, and
> here gcc happens to pick a different register each time, then
> preempt_count is accessed via that base register:
>
> 1058: ld r10,2392(r13) <-- current
> 105c: lwz r9,0(r10) <-- preempt_count
> 1060: addis r9,r9,-1
> 1064: stw r9,0(r10) <-- preempt_count
> 1068: ld r9,2392(r13) <-- current
> 106c: lwz r9,0(r9) <-- preempt_count
> 1070: rlwinm. r9,r9,0,11,23
> 1074: bne 1090 <irq_exit+0x60>
>
> This can frustrate store-hit-load detection heuristics and cause
> flushes. Allowing the compiler to cache current in a reigster with this
> patch results in the same base register being used for all accesses,
> which is more likely to be detected as an alias:
>
> 1058: ld r31,2392(r13)
> ...
> 1070: lwz r9,0(r31)
> 1074: addis r9,r9,-1
> 1078: stw r9,0(r31)
> 107c: lwz r9,0(r31)
> 1080: rlwinm. r9,r9,0,11,23
> 1084: bne 10a0 <irq_exit+0x60>
>
> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/e354d7dc81d0e81bea33165f381aff1eda45f5d9
cheers
More information about the Linuxppc-dev
mailing list