[PATCH] powerpc: Don't clear larx reservation on system call exit
Benjamin Herrenschmidt
benh at kernel.crashing.org
Mon Feb 15 13:24:47 EST 2010
On Mon, 2010-02-15 at 12:40 +1100, Anton Blanchard wrote:
> Right now we clear the larx reservation on every system call exit. No code
> should exist that tries to make use of larx/stcx across a system call (because
> it would fail 100% of the time).
>
> We could continue to play it safe but system call latency affects so many
> workloads. In the past we have already cut down the set of registers we
> save and restore across a system call and this could be seen as an
> extension of that. The PowerPC system call ABI does not (and could not)
> preserve a larx reservation.
>
> On POWER6 the poster child for system call improvements, getppid, improves 6%.
>
> A more useful test is the private futex wake system call and that improves 5%.
> This is a decent speedup on an important system call for threaded applications.
>
> Signed-off-by: Anton Blanchard <anton at samba.org>
> ---
> If my previous patches didn't worry you then this one is sure to.
>
> Getting this wrong will make someone's life miserable, so it could do with
> some double checking (eg we don't branch through there on other exceptions and
> we dont invoke system calls from the kernel that rely on the reservation being
> cleared).
Well, the main issue here is leaking kernel reservations into userspace,
and thus the question of whether it is a big deal or not. There's also
an issue I can see with signals.
The risk with kernel reservations leaking into userspace is a problem on
some processors that do not compare the reservation address locally
(only for snoops), thus userspace code doing lwarx/syscall/stwcx. might
end up with a suceeding stwcx. despite the fact that the original
reservation was long lost.
At this stage it becomes an ABI problem, ie, whether we define the
behaviour of a lwarx/stwcx. accross a syscall as defined or not.
The other problem I see is that signal handlers would have to be made
very careful not to leave dangling reservations since the return from
the syscall is a syscall, unless we add code specifically to this (and
set_context too I'd say) to clear reservations.
IE. You could have something like:
lwarx, <interrupt>, signal handler, sigreturn, stwcx.
In the above case, the reservation would be cleared by the return from
the interrupt, but the signal handler might leave a dangling one, which
sigreturn might fail to clear (in practice, our current implementation
of sys_sigreturn() will probably clear any reservation as a side effect
of restore_sigmask() spinlock or set_thread_flag() but it sounds a bit
fragile to rely on unless it's well documented).
Cheers,
Ben.
> Index: powerpc.git/arch/powerpc/kernel/entry_64.S
> ===================================================================
> --- powerpc.git.orig/arch/powerpc/kernel/entry_64.S 2010-02-13 16:26:43.794322638 +1100
> +++ powerpc.git/arch/powerpc/kernel/entry_64.S 2010-02-13 16:27:03.205575405 +1100
> @@ -202,7 +202,6 @@ syscall_exit:
> bge- syscall_error
> syscall_error_cont:
> ld r7,_NIP(r1)
> - stdcx. r0,0,r1 /* to clear the reservation */
> andi. r6,r8,MSR_PR
> ld r4,_LINK(r1)
> /*
More information about the Linuxppc-dev
mailing list