[PATCH 1/2] powerpc/tm: fix live state of vs0/32 in tm_reclaim

Cyril Bur cyrilbur at gmail.com
Tue Jul 4 10:19:00 AEST 2017


On Thu, 2017-06-29 at 20:44 -0400, Gustavo Romero wrote:
> Currently tm_reclaim() can return with a corrupted vs0 (fp0) or vs32 (v0)
> due to the fact vs0 is used to save FPSCR and vs32 is used to save VSCR.
> 
> Later, we recheckpoint trusting that the live state of FP and VEC are ok
> depending on the MSR.FP and MSR.VEC bits, i.e. if MSR.FP is enabled that
> means the FP registers checkpointed before entering are correct and so
> after a treclaim. we can trust the FP live state, and similarly on VEC regs.
> However if tm_reclaim() does not return a sane state then tm_recheckpoint()
> will recheckpoint a corrupted instate back to the checkpoint area.
> 
> That commit fixes the corruption by restoring vs0 and vs32 from the
> ckfp_state and ckvr_state after they are used to save FPSCR and VSCR,
> respectively.
> 
> The effect of the issue described above is observed, for instance, once a
> VSX unavailable exception is caught in the middle of a transaction with
> MSR.FP = 1 or MSR.VEC = 1. If MSR.FP = 1, then after getting back to user
> space FP state is corrupted. If MSR.VEC = 1, then VEC state is corrupted.
> 
> The issue does occur if MSR.FP = 0 and MSR.VEC = 0 because ckfp_state and
> ckvr_state are both copied from fp_state and vr_state, respectively, and
> on recheckpointing both states will be restored from these thread
> structures and not from the live state.
> 
> The issue does no occur also if MSR.FP = 1 and MSR.VEC = 1 because it
> implies MSR.VSX = 1 and in that case the VSX unavailable exception does not
> happen in the middle of the transactional block.
> 
> Finally, that commit also fixes the MSR used to check if FP or VEC bit are
> enabled once we are in tm_reclaim_thread(). Checking ckpt_regs.msr is not
> correct because giveup_all(), which copies regs->msr to ckpt_regs.msr, was
> not called before and so the ckpt_regs.msr at that point is not valid, i.e.
> it does not reflect the MSR state in userspace.
> 
> No regression was observed on powerpc/tm selftests after this fix.
> 
> Signed-off-by: Gustavo Romero <gromero at linux.vnet.ibm.com>
> Signed-off-by: Breno Leitao <leitao at debian.org>
> ---
>  arch/powerpc/kernel/process.c | 15 ++++++++-------
>  arch/powerpc/kernel/tm.S      | 16 ++++++++++++++++
>  2 files changed, 24 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 2ad725e..df8e348 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -870,21 +870,22 @@ static void tm_reclaim_thread(struct thread_struct *thr,
>  	 * state is the same as the live state. We need to copy the
>  	 * live state to the checkpointed state so that when the
>  	 * transaction is restored, the checkpointed state is correct
> -	 * and the aborted transaction sees the correct state. We use
> -	 * ckpt_regs.msr here as that's what tm_reclaim will use to
> -	 * determine if it's going to write the checkpointed state or
> -	 * not. So either this will write the checkpointed registers,
> -	 * or reclaim will. Similarly for VMX.
> +	 * and the aborted transaction sees the correct state. So
> +	 * either this will write the checkpointed registers, or
> +	 * reclaim will. Similarly for VMX.
>  	 */
> -	if ((thr->ckpt_regs.msr & MSR_FP) == 0)
> +	if ((thr->regs->msr & MSR_FP) == 0)
>  		memcpy(&thr->ckfp_state, &thr->fp_state,
>  		       sizeof(struct thread_fp_state));
> -	if ((thr->ckpt_regs.msr & MSR_VEC) == 0)
> +	if ((thr->regs->msr & MSR_VEC) == 0)
>  		memcpy(&thr->ckvr_state, &thr->vr_state,
>  		       sizeof(struct thread_vr_state));
>  

So the name of that variable is horrifying - I don't know why it is
called that I expect to save 'space' but its not helping anyone. Poor
variable names not withstanding I believe the original code is correct.

What the chkp_regs.msr means here is, the MSR that the thread had when
it when it came off the processor. The reason for this is that
giveup_fpu (and friends) will modify thread.regs->msr when doing the
giveup so when reclaiming we can't trust it to know what the live state
really was. check_if_tm_restore_required() copies the thread.regs->msr
into the checkpointed structure so we know if the thread was really
using FP/VMX/VSX or not. check_if_tm_restore_required() is called
before any giveup operation.

I do wonder if it would make more sense for the giveup_all() below to
be above. I don't think there are any code pathes that can get here
without having already done a giveup_all() but, perhaps it is possible.
I don't think it will hurt to move it up, it feels more correct

>  	giveup_all(container_of(thr, struct task_struct, thread));
>  
> +	/* After giveup_all(), ckpt_regs.msr contains the same value
> +	 * of regs->msr when we entered in kernel space.
> +	 */
>  	tm_reclaim(thr, thr->ckpt_regs.msr, cause);
>  }
>  
> diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S
> index 3a2d041..67b56af 100644
> --- a/arch/powerpc/kernel/tm.S
> +++ b/arch/powerpc/kernel/tm.S
> @@ -259,9 +259,18 @@ _GLOBAL(tm_reclaim)
>  
>  	addi	r7, r3, THREAD_CKVRSTATE
>  	SAVE_32VRS(0, r6, r7)	/* r6 scratch, r7 transact vr state */
> +
> +	/* Corrupting v0 (vs32). Should restore it later. */
>  	mfvscr	v0
>  	li	r6, VRSTATE_VSCR
>  	stvx	v0, r7, r6
> +
> +	/* Restore v0 (vs32) from ckvr_state.vr[0], otherwise we might
> +	 * recheckpoint the wrong live value.
> +	 */
> +	lxvx	vs32, 0, r7
> +	xxswapd	vs32, vs32

Yes good find!

I feel like the crux of the problem is that we don't always
tm_recheckpoint() with the same msr as we tm_reclaimed() with, is THIS
the core of the problem? In the traps.c case it is an optimisation,
perhaps a pointless one, if I had spare time I would benchmark if it is
worth it.

This code absolutely can't hurt and does solve a real problem.

Perhaps use the macro, I think what you want is:

/* r0 evaluates to literal zero pp 489 ISA 3.0 */
LXVD2X_ROT(32, r0, r7)

> +
>  dont_backup_vec:
>  	mfspr	r0, SPRN_VRSAVE
>  	std	r0, THREAD_CKVRSAVE(r3)
> @@ -272,9 +281,16 @@ dont_backup_vec:
>  	addi	r7, r3, THREAD_CKFPSTATE
>  	SAVE_32FPRS_VSRS(0, R6, R7)	/* r6 scratch, r7 transact fp state */
>  
> +	/* Corrupting fr0 (vs0). Should restore it later. */
>  	mffs    fr0
>  	stfd    fr0,FPSTATE_FPSCR(r7)
>  
> +	/* Restore fr0 (vs32) from ckfp_state.fpr[0], otherwise we might
> +	 * recheckpoint the wrong live value.
> +	 */
> +	lxvx	vs0, 0, r7
> +	xxswapd vs0, vs0

/* r0 evaluates to literal zero pp 489 ISA 3.0 */
LXVD2X_ROT(0, r0, r7)


> +
>  dont_backup_fp:
>  
>  	/* TM regs, incl TEXASR -- these live in thread_struct.  Note they've


More information about the Linuxppc-dev mailing list