[PATCH] powerpc/tm: fix live state of vs0/32 in tm_reclaim

Cyril Bur cyrilbur at gmail.com
Thu Oct 26 15:57:33 AEDT 2017


On Wed, 2017-07-05 at 11:02 +1000, Michael Neuling wrote:
> On Tue, 2017-07-04 at 16:45 -0400, Gustavo Romero wrote:
> > Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> > due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> 

Hi Mikey,

This completely fell off my radar, we do need something merged!

For what its worth I like the original patch.

> tm_reclaim() should have no state live in the registers once it returns.  It
> should all be saved in the thread struct. The above is not an issue in my book.
> 

Yeah, this is something I agree with, however, if that is the case then
why have tm_recheckpoint() do partial reloads?

A partial reload only makes sense if we can be sure that reclaim will
have left the state at least (partially) correct - not with (as is the
case today) one corrupted fp or Altivec reg.

> Having a quick look at the code, I think there's and issue but we need something
> more like this (completely untested).
> 
> When we recheckpoint inside an fp unavail, we need to recheckpoint vec if it was
> enabled.  Currently we only ever recheckpoint the FP which seems like a bug. 
> Visa versa for the other way around.
> 

In your example, we don't need to reload VEC if we can trust that
reclaim left the checkpointed regs on the CPU correctly - this patch
achieves this.

Of course I'm more than happy to reduce complexity and not have this
optimisation at all but then we should remove the entire parameter to
tm_recheckpoint(). Any in between feels dangerous.


Cyril


> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index d4e545d27e..d1184264e2 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -1589,7 +1589,7 @@ void fp_unavailable_tm(struct pt_regs *regs)
>          * If VMX is in use, the VRs now hold checkpointed values,
>          * so we don't want to load the VRs from the thread_struct.
>          */
> -       tm_recheckpoint(&current->thread, MSR_FP);
> +       tm_recheckpoint(&current->thread, regs->msr);
>  
>         /* If VMX is in use, get the transactional values back */
>         if (regs->msr & MSR_VEC) {
> @@ -1611,7 +1611,7 @@ void altivec_unavailable_tm(struct pt_regs *regs)
>                  regs->nip, regs->msr);
>         tm_reclaim_current(TM_CAUSE_FAC_UNAV);
>         regs->msr |= MSR_VEC;
> -       tm_recheckpoint(&current->thread, MSR_VEC);
> +       tm_recheckpoint(&current->thread, regs->msr);
>         current->thread.used_vr = 1;
>  
>         if (regs->msr & MSR_FP) {
> 
> 
> > Later, we recheckpoint trusting that the live state of FP and VEC are ok
> > depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> > means the FP registers checkpointed when we entered in TM are correct and
> > after a treclaim. we can trust the FP live state. Similarly to VEC regs.
> > However if tm_reclaim() does not return a sane state then tm_recheckpoint()
> > will recheckpoint a corrupted state from live state back to the checkpoint
> > area.
> 
> 
> 
> 
> > That commit fixes the corruption by restoring vs0 and vs32 from the
> > ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> > respectively.
> > 
> > The effect of the issue described above is observed, for instance, once a
> > VSX unavailable exception is caught in the middle of a transaction with
> > MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> > space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> > 
> > The issue does not occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state
> > and ckvr_state are both copied from fp_state and vr_state, respectively,
> > and on recheckpointing both states will be restored from these thread
> > structures and not from the live state.
> > 
> > The issue does not occur also if MSR.FP = 1 and MSR.VEC = 1 because it
> > implies MSR.VSX = 1 and in that case the VSX unavailable exception does not
> > happen in the middle of the transactional block.
> > 
> > Finally, that commit also fixes the MSR used to check if FP and VEC bits
> > are enabled once we are in tm_reclaim_thread(). ckpt_regs.msr is valid only
> > if giveup_all() is called *before* using ckpt_regs.msr for checks because
> > check_if_tm_restore_required() in giveup_all() will copy regs->msr to
> > ckpt_regs.msr and so ckpt_regs.msr reflects exactly the MSR that the thread
> > had when it came off the processor.
> > 
> > No regression was observed on powerpc/tm selftests after this fix.
> > 
> > Signed-off-by: Gustavo Romero <gromero at linux.vnet.ibm.com>
> > Signed-off-by: Breno Leitao <leitao at debian.org>
> > ---
> >  arch/powerpc/kernel/process.c |  9 +++++++--
> >  arch/powerpc/kernel/tm.S      | 14 ++++++++++++++
> >  2 files changed, 21 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 2ad725e..ac1fc51 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -864,6 +864,13 @@ static void tm_reclaim_thread(struct thread_struct *thr,
> >  	if (!MSR_TM_SUSPENDED(mfmsr()))
> >  		return;
> >  
> > +	/* Copy regs->msr to ckpt_regs.msr making the last valid for
> > +	 * the checks below. check_if_tm_restore_required() in
> > +	 * giveup_all() will take care of it. Also update fp_state
> > +	 * and vr_state from live state if the live state is valid.
> > +	 */
> > +	giveup_all(container_of(thr, struct task_struct, thread));
> > +
> >  	/*
> >  	 * If we are in a transaction and FP is off then we can't have
> >  	 * used FP inside that transaction. Hence the checkpointed
> > @@ -883,8 +890,6 @@ static void tm_reclaim_thread(struct thread_struct *thr,
> >  		memcpy(&thr->ckvr_state, &thr->vr_state,
> >  		       sizeof(struct thread_vr_state));
> >  
> > -	giveup_all(container_of(thr, struct task_struct, thread));
> > -
> >  	tm_reclaim(thr, thr->ckpt_regs.msr, cause);
> >  }
> >  
> > diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
> > index 3a2d041..5dfbddb 100644
> > --- a/arch/powerpc/kernel/tm.S
> > +++ b/arch/powerpc/kernel/tm.S
> > @@ -259,9 +259,17 @@ _GLOBAL(tm_reclaim)
> >  
> >  	addi	r7, r3, THREAD_CKVRSTATE
> >  	SAVE_32VRS(0, r6, r7)	/* r6 scratch, r7 transact vr state */
> > +
> > +	/* Corrupting v0 (vs32). Should restore it later. */
> >  	mfvscr	v0
> >  	li	r6, VRSTATE_VSCR
> >  	stvx	v0, r7, r6
> > +
> > +	/* Restore v0 (vs32) from ckvr_state.vr[0], otherwise we might
> > +	 * recheckpoint the wrong live value.
> > +	 */
> > +	LXVD2X_ROT(32, R0, R7)
> > +
> >  dont_backup_vec:
> >  	mfspr	r0, SPRN_VRSAVE
> >  	std	r0, THREAD_CKVRSAVE(r3)
> > @@ -272,9 +280,15 @@ dont_backup_vec:
> >  	addi	r7, r3, THREAD_CKFPSTATE
> >  	SAVE_32FPRS_VSRS(0, R6, R7)	/* r6 scratch, r7 transact fp
> > state */
> >  
> > +	/* Corrupting fr0 (vs0). Should restore it later. */
> >  	mffs    fr0
> >  	stfd    fr0,FPSTATE_FPSCR(r7)
> >  
> > +	/* Restore fr0 (vs0) from ckfp_state.fpr[0], otherwise we might
> > +	 * recheckpoint the wrong live value.
> > +	 */
> > +	LXVD2X_ROT(0, R0, R7)
> > +
> >  dont_backup_fp:
> >  
> >  	/* TM regs, incl TEXASR -- these live in thread_struct.  Note they've


More information about the Linuxppc-dev mailing list