[RFC PATCH 05/14] powerpc/tm: Refactor the __switch_to_tm code
Michael Neuling
mikey at neuling.org
Thu Nov 15 12:04:20 AEDT 2018
On Tue, 2018-11-06 at 10:40 -0200, Breno Leitao wrote:
> __switch_to_tm is the function that switches between two tasks which might
> have TM enabled. This function is clearly split in two parts, the task that
> is leaving the CPU, known as 'prev' and the task that is being scheduled,
> known as 'new'.
>
> It starts by checking if the previous task had TM enabled; if so, it
> increments load_tm (this is the only place load_tm is incremented). It
> also saves the TM SPRs here.
>
> If the previous task was scheduled out with a transaction active, the
> failure cause needs to be updated, since it might still hold the cause
> recorded at exception entry (such as TM_CAUSE_MISC). In this case, since
> there was a context switch, overwrite the failure cause.
>
> If the previous task's load_tm counter has overflowed, disable TM,
> returning the facility to the lazy save/restore mechanism.
>
> Regarding the 'new' task being scheduled, restoring TM SPRs is enough if
> the task had TM enabled when it was de-scheduled. (Checking if a
> recheckpoint would be required will be done later, at restore_tm_state()
> stage.)
>
> On top of that, the tm_reclaim_task() and tm_recheckpoint_new_task()
> functions are no longer used, so they are removed.
Is the above describing the previous functionality or the refactored
functionality?
>
> Signed-off-by: Breno Leitao <leitao at debian.org>
> ---
> arch/powerpc/kernel/process.c | 167 ++++++++++++++++------------------
> 1 file changed, 78 insertions(+), 89 deletions(-)
>
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 1842fd96b123..73872f751b33 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -912,48 +912,6 @@ void tm_reclaim_current(uint8_t cause)
> tm_reclaim_thread(&current->thread, cause);
> }
>
> -static inline void tm_reclaim_task(struct task_struct *tsk)
> -{
> - /* We have to work out if we're switching from/to a task that's in the
> - * middle of a transaction.
> - *
> - * In switching we need to maintain a 2nd register state as
> - * oldtask->thread.ckpt_regs. We tm_reclaim(oldproc); this saves the
> - * checkpointed (tbegin) state in ckpt_regs, ckfp_state and
> - * ckvr_state
> - *
> - * We also context switch (save) TFHAR/TEXASR/TFIAR in here.
> - */
> - struct thread_struct *thr = &tsk->thread;
> -
> - if (!thr->regs)
> - return;
> -
> - if (!MSR_TM_ACTIVE(thr->regs->msr))
> - goto out_and_saveregs;
> -
> - WARN_ON(tm_suspend_disabled);
> -
> - TM_DEBUG("--- tm_reclaim on pid %d (NIP=%lx, "
> - "ccr=%lx, msr=%lx, trap=%lx)\n",
> - tsk->pid, thr->regs->nip,
> - thr->regs->ccr, thr->regs->msr,
> - thr->regs->trap);
> -
> - tm_reclaim_thread(thr, TM_CAUSE_RESCHED);
> -
> - TM_DEBUG("--- tm_reclaim on pid %d complete\n",
> - tsk->pid);
> -
> -out_and_saveregs:
> - /* Always save the regs here, even if a transaction's not active.
> - * This context-switches a thread's TM info SPRs. We do it here to
> - * be consistent with the restore path (in recheckpoint) which
> - * cannot happen later in _switch().
> - */
> - tm_save_sprs(thr);
> -}
> -
> extern void __tm_recheckpoint(struct thread_struct *thread);
>
> void tm_recheckpoint(struct thread_struct *thread)
> @@ -980,59 +938,91 @@ void tm_recheckpoint(struct thread_struct *thread)
> local_irq_restore(flags);
> }
>
> -static inline void tm_recheckpoint_new_task(struct task_struct *new)
> +static void tm_change_failure_cause(struct task_struct *task, uint8_t cause)
> {
> - if (!cpu_has_feature(CPU_FTR_TM))
> - return;
> -
> - /* Recheckpoint the registers of the thread we're about to switch to.
> - *
> - * If the task was using FP, we non-lazily reload both the original and
> - * the speculative FP register states. This is because the kernel
> - * doesn't see if/when a TM rollback occurs, so if we take an FP
> - * unavailable later, we are unable to determine which set of FP regs
> - * need to be restored.
> - */
> - if (!tm_enabled(new))
> - return;
> -
> - if (!MSR_TM_ACTIVE(new->thread.regs->msr)){
> - tm_restore_sprs(&new->thread);
> - return;
> - }
> - /* Recheckpoint to restore original checkpointed register state. */
> - TM_DEBUG("*** tm_recheckpoint of pid %d (new->msr 0x%lx)\n",
> - new->pid, new->thread.regs->msr);
> -
> - tm_recheckpoint(&new->thread);
> -
> - /*
> - * The checkpointed state has been restored but the live state has
> - * not, ensure all the math functionality is turned off to trigger
> - * restore_math() to reload.
> - */
> - new->thread.regs->msr &= ~(MSR_FP | MSR_VEC | MSR_VSX);
> -
> - TM_DEBUG("*** tm_recheckpoint of pid %d complete "
> - "(kernel msr 0x%lx)\n",
> - new->pid, mfmsr());
> + task->thread.tm_texasr &= ~TEXASR_FC;
> + task->thread.tm_texasr |= (unsigned long) cause << TEXASR_FC_LG;
> }
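As an aside for anyone following along: the TEXASR update above is just replacing the 8-bit failure code held in the top byte of the saved TEXASR image. A stand-alone C model of it (the shift and the TM_CAUSE_RESCHED value are quoted from my memory of reg.h and uapi asm/tm.h, so treat them as assumptions and double check before relying on them):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-alone model of tm_change_failure_cause(): swap out the 8-bit
 * failure code held in the top byte of a saved TEXASR image.
 * TEXASR_FC_LG is (63 - 7) in IBM bit numbering, i.e. a shift of 56;
 * these values are assumptions for illustration, not authoritative. */
#define TEXASR_FC_LG     56
#define TEXASR_FC        (0xFFULL << TEXASR_FC_LG)
#define TM_CAUSE_RESCHED 0xde  /* from uapi asm/tm.h, quoted from memory */

static uint64_t set_failure_cause(uint64_t texasr, uint8_t cause)
{
	/* Clear the old failure code, then install the new one. */
	texasr &= ~TEXASR_FC;
	texasr |= (uint64_t)cause << TEXASR_FC_LG;
	return texasr;
}
```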
>
> static inline void __switch_to_tm(struct task_struct *prev,
> struct task_struct *new)
> {
> - if (cpu_has_feature(CPU_FTR_TM)) {
> - if (tm_enabled(prev) || tm_enabled(new))
> - tm_enable();
> + if (!cpu_has_feature(CPU_FTR_TM))
> + return;
> +
> + /* The task leaving the CPU was using TM, let's handle it */
> + if (tm_enabled(prev)) {
> + /*
> + * Load_tm is incremented only when the task is scheduled out
> + */
> + prev->thread.load_tm++;
> +
> + /*
> + * If TM is enabled for the thread, it needs to, at least,
> + * save the SPRs
> + */
> + tm_enable();
> + tm_save_sprs(&prev->thread);
> +
> + /*
> + * If the task being descheduled had an active transaction
> + * during the exception that brought it here, the
> + * transaction was reclaimed by TM_KERNEL_ENTRY.
> + *
> + * Independently of the TEXASR failure code set at
> + * TM_KERNEL_ENTRY time, the task is being descheduled, and
> + * the failure code needs to be updated to reflect it.
> + */
> + if (MSR_TM_ACTIVE(prev->thread.regs->msr)) {
> + /*
> + * If there was an IRQ during trecheckpoint, it will
> + * cause an IRQ to be replayed. This replayed IRQ can
> + * invoke SCHEDULE_USER, thus, we arrive here with a TM
> + * active transaction.
> + * I.e, the task was leaving kernelspace to userspace,
> + * already trecheckpointed, but there was a IRQ during
> + * the trecheckpoint process (soft irq disabled), and
> + * on the IRQ replay, the process was de-scheduled, so,
> + * SCHEDULE_USER was called and here we are.
> + *
> + */
> + if (WARN_ON(MSR_TM_ACTIVE(mfmsr())))
> + tm_reclaim_current(TM_CAUSE_RESCHED);
Do we need the MSR_TM_ACTIVE() check gating this? If MSR_TM_ACTIVE(mfmsr()) is
true after TM_KERNEL_ENTRY, something is pretty broken. (In fact you have this
check later... so just remove this one, I think.)
>
> - if (tm_enabled(prev)) {
> - prev->thread.load_tm++;
> - tm_reclaim_task(prev);
> - if (!MSR_TM_ACTIVE(prev->thread.regs->msr) && prev->thread.load_tm == 0)
> + /*
> + * If rescheduled with TM active, update the
> + * failure cause
> + */
> + tm_change_failure_cause(prev, TM_CAUSE_RESCHED);
> + } else {
else if to avoid more indenting...
> + /*
> + * TM enabled but not transactional. Just disable TM
> + * if load_tm overflows. This should be the only place
> + * that disables the TM and reenables the laziness
> + * save/restore
> + */
> + if (prev->thread.load_tm == 0)
> prev->thread.regs->msr &= ~MSR_TM;
> }
> + }
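The load_tm wrap-around is the subtle bit in that else branch. A toy model of the counter (the 8-bit width is my assumption for illustration; the real field lives in struct thread_struct):

```c
#include <stdint.h>

/* Toy model of the lazy-disable counter: the width and wrap behaviour
 * here are assumptions for illustration, not the kernel's definition. */
struct tm_thread {
	uint8_t load_tm;  /* bumped on every sched-out while TM is enabled */
	int     msr_tm;   /* 1 = MSR_TM set for the task */
};

/* Sched-out path for a TM-enabled but non-transactional task: bump the
 * counter, and when it wraps to zero drop MSR_TM so the facility goes
 * back to the lazy save/restore path. */
static void sched_out_non_transactional(struct tm_thread *t)
{
	t->load_tm++;
	if (t->load_tm == 0)
		t->msr_tm = 0;
}
```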
>
> - tm_recheckpoint_new_task(new);
> + /*
> + * It is a *bug* if we arrived so late with a transaction active
> + * (more precisely suspended)
> + */
> + if (WARN_ON(MSR_TM_ACTIVE(mfmsr()))) {
> + /* Recovery path. 0x99 shouldn't be exported to UAPI */
> + tm_reclaim_current(0x99);
> + }
As discussed privately, if we hit this case, either we BUG_ON() or we WARN_ON()
AND kill the process. Something is bust in the kernel at this point so we need
to do something more than just warning and limp along with a reclaim.
Also, no {} needed for single line if statements.
> +
> + /*
> + * If the next task has TM enabled, restore the SPRs. Do not need to
> + * care about recheckpoint at this time. It will be done later if
> + * TIF_RESTORE_TM was set when the task was scheduled out
> + */
> + if (tm_enabled(new)) {
> + tm_enable();
> + tm_restore_sprs(&new->thread);
> }
> }
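For readers skimming the thread: the key structural change is that switch-in now only restores the SPRs, and the (expensive) trecheckpoint is deferred to restore_tm_state() on the way back to userspace, keyed off TIF_RESTORE_TM. A stand-alone sketch of that split (all names here are illustrative stand-ins, not the real kernel structures):

```c
#include <stdbool.h>

/* Sketch of the deferred-recheckpoint split: switch-in restores the TM
 * SPRs only; the trecheckpoint happens later at exit-to-user, keyed off
 * a TIF_RESTORE_TM-style flag. Illustrative model, not kernel code. */
struct tm_task {
	bool tm_enabled;
	bool tif_restore_tm;   /* set at sched-out if a transaction was active */
	bool sprs_live;
	bool recheckpointed;
};

static void switch_in(struct tm_task *t)
{
	if (!t->tm_enabled)
		return;
	t->sprs_live = true;   /* tm_restore_sprs(); no recheckpoint here */
}

static void exit_to_user(struct tm_task *t)
{
	if (t->tif_restore_tm) {       /* restore_tm_state() */
		t->recheckpointed = true;
		t->tif_restore_tm = false;
	}
}
```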
>
> @@ -1094,7 +1084,6 @@ void restore_tm_state(struct pt_regs *regs)
> }
>
> #else
> -#define tm_recheckpoint_new_task(new)
> #define __switch_to_tm(prev, new)
> #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>
> @@ -1599,9 +1588,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
> /*
> * Flush TM state out so we can copy it. __switch_to_tm() does this
> * flush but it removes the checkpointed state from the current CPU and
> - * transitions the CPU out of TM mode. Hence we need to call
> - * tm_recheckpoint_new_task() (on the same task) to restore the
> - * checkpointed state back and the TM mode.
> + * transitions the CPU out of TM mode. Hence we need to make sure
> + * TIF_RESTORE_TM is set so restore_tm_state is called to restore the
> + * checkpointed state and back to TM mode.
> *
> * Can't pass dst because it isn't ready. Doesn't matter, passing
> * dst is only important for __switch_to()