[PATCH 0/5] powerpc: Implement masked user access
David Laight
david.laight.linux at gmail.com
Tue Jun 24 18:32:58 AEST 2025
On Tue, 24 Jun 2025 07:27:47 +0200
Christophe Leroy <christophe.leroy at csgroup.eu> wrote:
> On 22/06/2025 at 18:20, David Laight wrote:
> > On Sun, 22 Jun 2025 11:52:38 +0200
> > Christophe Leroy <christophe.leroy at csgroup.eu> wrote:
> >
> >> Masked user access avoids the address/size verification by access_ok().
> >> Although its main purpose is to avoid speculation in the verification
> >> of the user address and size, and hence avoid the need for speculation
> >> mitigation, it also reduces the number of instructions needed, so it
> >> also benefits platforms that don't need speculation mitigation,
> >> especially when the size of the copy is not known at build time.
> >
> > It also removes a conditional branch that is quite likely to be
> > statically predicted 'the wrong way'.
>
> But include/asm-generic/access_ok.h defines access_ok() as:
>
> #define access_ok(addr, size) likely(__access_ok(addr, size))
>
> So GCC uses the 'unlikely' variant of the branch instruction to force
> the correct prediction, doesn't it?
Nope...
Most architectures don't have likely/unlikely variants of branches.
So all gcc can do is decide which path is the fall-through and
whether the branch is forwards or backwards.
Additionally, unless there is code in both the 'if' and 'else' clauses,
the [un]likely seems to have no effect.
So on simple CPUs that predict 'backwards branches taken' you can get
the desired effect - but it may need an 'asm comment' to force the
compiler to generate the required branches (eg a forwards branch directly
to a backwards unconditional jump).
On x86 it is all more complicated.
I think the prefetch code is likely to assume 'not taken' (but might
use stale info on the cache line).
The predictor itself never does 'static prediction' - it always uses
whatever the branch prediction data structures happen to contain.
So, unless you are in a loop (eg running a benchmark!) there is pretty
much a 50% chance of a branch mispredict.
I've been trying to benchmark different versions of the u64 * u64 / u64
function - and I think mispredicted branches make a big difference.
I need to sit down and sequence the test cases so that I can see
the effect of each branch!
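Roughly this sort of harness - hypothetical, with mul_div() standing in
for the function under test and the x86 tsc for timing:

    #include <stdint.h>
    #include <x86intrin.h>

    /* stand-in for the u64*u64/u64 implementation being measured */
    extern uint64_t mul_div(uint64_t a, uint64_t b, uint64_t c);

    static uint64_t time_calls(const uint64_t *a, int n)
    {
        uint64_t t = __rdtsc();
        for (int i = 0; i < n; i++)
            mul_div(a[i], a[i] + 1, 7);
        return __rdtsc() - t;
    }

Feed it one array where every a[i] is the same (so all the internal
branches train) and one where the values alternate across each
threshold; the difference is mostly mispredict cost.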
>
> >
> >> Unlike x86_64, which masks the address to 'all bits set' when the
> >> user address is invalid, here the address is set to an address in
> >> the gap. This avoids relying on the zero page to catch offset
> >> accesses. On book3s/32 it makes sure the opened access window
> >> remains in the user segment. The extra cost is a single instruction
> >> in the masking.
> >
> > That isn't true (any more).
> > Linus changed the check to (approx):
> > 	if (uaddr > TASK_SIZE)
> > 		uaddr = TASK_SIZE;
> > (Implemented with a conditional move)
>
> Ah ok, I overlooked that. I didn't know about the cmov instruction; it
> seems similar to the isel instruction on powerpc e500.
It got added with the Pentium Pro - I learnt 8086 :-)
I suspect x86 got there first...
Although called 'conditional move', I very much suspect the write is
actually unconditional - the destination is written with either its old
value or the new one.
So the hardware implementation is much the same as 'add with carry',
except the ALU operation is a simple multiplex.
Which means it is unlikely to involve any speculation.
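FWIW the clamp itself is easy to keep branch-free; a sketch (mask_uaddr()
is my name for it - IIRC the real x86-64 helper is mask_user_address()):

    static inline void __user *mask_uaddr(void __user *uaddr)
    {
        /* gcc should emit cmp + cmova (x86-64) or cmplw + isel (e500)
         * here, not a conditional branch */
        return uaddr > (void __user *)TASK_SIZE
            ? (void __user *)TASK_SIZE : uaddr;
    }

(The powerpc series clamps to an address in the gap instead, hence the
one extra instruction in the masking.)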
David
>
> Christophe
>