[PATCH] powerpc/tm: Set MSR[TS] just prior to recheckpoint
Michael Neuling
mikey at neuling.org
Tue Nov 20 10:30:52 AEDT 2018
On Mon, 2018-11-19 at 10:44 -0200, Breno Leitao wrote:
> On a signal handler return, the user could set a context with MSR[TS] bits
> set, and these bits would be copied to task regs->msr.
>
> At restore_tm_sigcontexts(), after current task regs->msr[TS] bits are set,
> several __get_user() are called and then a recheckpoint is executed.
Do we have the same problem on the entry side? There are a bunch of
copy_from_user() before we do a tm_reclaim() (ie when still transactional). Is
there some chance of the same problem there?
> This is a problem since a page fault (in kernel space) could happen when
> calling __get_user(). If it happens, the process MSR[TS] bits were
> already set, but recheckpoint was not executed, and SPRs are still invalid.
>
> The page fault can cause the current process to be de-scheduled, with
> MSR[TS] active and without tm_recheckpoint() being called. More
> importantly, without TEXAR[FS] bit set also.
This patch is great and should go in ASAP
> Since TEXASR might not have the FS bit set, and when the process is
> scheduled back, it will try to reclaim, which will be aborted because of
> the CPU is not in the suspended state, and, then, recheckpoint. This
> recheckpoint will restore thread->texasr into TEXASR SPR, which might be
> zero, hitting a BUG_ON().
>
> [ 2181.457997] kernel BUG at arch/powerpc/kernel/tm.S:446!
Can you put more of the backtrace here? Can be useful for people searching for
a similar problem.
> This patch simply delays the MSR[TS] set, so, if there is any page fault in
> the __get_user() section, it does not have regs->msr[TS] set, since the TM
> structures are still invalid, thus avoiding doing TM operations for
> in-kernel exceptions and possible process reschedule.
Can you put a comment in the code what says after the MSR[TS] setting, there can
be no get/put_user before the recheckpoint?
Also, it looks like the 32bit signals case is safe, but please add a comment in
there too.
> With this patch, the MSR[TS] will only be set just before recheckpointing
> and setting TEXASR[FS] = 1, thus avoiding an interrupt with TM registers in
> invalid state.
>
> It is not possible to move tm_recheckpoint to happen earlier, because it is
> required to get the checkpointed registers from userspace, with
> __get_user(), thus, the only way to avoid this undesired behavior is
> delaying the MSR[TS] set, as done in this patch.
>
> Fixes: 87b4e5393af7 ("powerpc/tm: Fix return of active 64bit signals")
> Cc: stable at vger.kernel.org (v3.9+)
> Signed-off-by: Breno Leitao <leitao at debian.org>
> ---
> arch/powerpc/kernel/signal_64.c | 29 +++++++++++++++--------------
> 1 file changed, 15 insertions(+), 14 deletions(-)
>
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index 83d51bf586c7..15b153bdd826 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -467,20 +467,6 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
> if (MSR_TM_RESV(msr))
> return -EINVAL;
>
> - /* pull in MSR TS bits from user context */
> - regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr & MSR_TS_MASK);
> -
> - /*
> - * Ensure that TM is enabled in regs->msr before we leave the signal
> - * handler. It could be the case that (a) user disabled the TM bit
> - * through the manipulation of the MSR bits in uc_mcontext or (b) the
> - * TM bit was disabled because a sufficient number of context switches
> - * happened whilst in the signal handler and load_tm overflowed,
> - * disabling the TM bit. In either case we can end up with an illegal
> - * TM state leading to a TM Bad Thing when we return to userspace.
> - */
> - regs->msr |= MSR_TM;
> -
> /* pull in MSR LE from user context */
> regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
>
> @@ -572,6 +558,21 @@ static long restore_tm_sigcontexts(struct task_struct *tsk,
> tm_enable();
> /* Make sure the transaction is marked as failed */
> tsk->thread.tm_texasr |= TEXASR_FS;
> +
> + /* pull in MSR TS bits from user context */
> + regs->msr = (regs->msr & ~MSR_TS_MASK) | (msr & MSR_TS_MASK);
> +
> + /*
> + * Ensure that TM is enabled in regs->msr before we leave the signal
> + * handler. It could be the case that (a) user disabled the TM bit
> + * through the manipulation of ararararthe MSR bits in uc_mcontext or (b) the
> + * TM bit was disabled because a sufficient number of context switches
> + * happened whilst in the signal handler and load_tm overflowed,
> + * disabling the TM bit. In either case we can end up with an illegal
> + * TM state leading to a TM Bad Thing when we return to userspace.
> + */
> + regs->msr |= MSR_TM;
> +
> /* This loads the checkpointed FP/VEC state, if used */
> tm_recheckpoint(&tsk->thread);
>
More information about the Linuxppc-dev
mailing list