[PATCH] x86/uaccess: Avoid barrier_nospec() in copy_from_user()

Fri Oct 25 19:56:16 AEDT 2024

...
> access_ok() itself is so rarely used these days that we could out-line
> it.  But the code cost of a function call is likely higher than
> inlining the 8-byte constant and a couple of instructions: not because
> the call instruction itself, but because of the code generation pain
> around it (register save/restore etc).

I was thinking it might need a non-standard call sequence so that
it could return yes/no and mask the address.

I did once wonder whether it could just mask the address.
Converting kernel addresses into ones that were guaranteed
to fault later.
Annoyingly I don't think that NULL can be used, maybe ~0 faults
for byte accesses - or is a very boring byte.
Certainly loading the TLB/cache line for ~0 isn't going to be useful.

...
> And with how slow 'ret' instructions can be, we really don't want
> small function calls.

Can that matter for leaf functions?
IIRC the issue is return stack underflow causing indirect branch
prediction be used - probably worse than just using wrapped
return stack value!
That won't happen for a leaf function unless you get an interrupt.
Static analysis could determine function returns that can never
underflow the return stack.

> In fact, it really makes me wonder if we should inline 'get_user()' entirely.

I think that would bloat some cold paths.
I know I've put a wrapper on copy_to/from_user() in the past.
(I don't think that is inlined at all these days?)
So you might want two copies.

An entirely horrid way is to have #define get_user(args) inline it
then (get_user)(args) would call the real function.

Mind you there are a lot of get_user() and put_user() in the getsockopt()
functions - which could all have a kernel address passed and
that access done by the wrapper.
A simple change to a kernel address doesn't change the prototype
so the complier doesn't detect unchanged code.
I've tried changing it to:
	new_size_or_errno = getsockopt(..., old_size)
but there are a few places that want to return an error and
update the size.
Plausibly returning (errno << n | size) from those places would
solve the problem - and the extra check would be in the error path.

Trouble is it is a massive patch series :-(

> > Especially if there is always a (PAGE sized) gap between the highest
> > user address and the lowest kernel address so the 'size' argument
> > to access_ok() can be ignored on the assumption that the accesses
> > are (reasonably) linear.
> 
> Yes, that's what we do right now for the inline code generation anyway.

Is that gap there on all architectures?

> (Ok, we actually do take the 'size' value into account when it's not
> constant, or when it's huge, but even that is just out of a silly
> abundance of caution and not a very common case).

Hmm the buffers for all read/write (etc) calls are variable length.
So it is a moderately common case.

	David

> 
>                 Linus

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)