[PATCH] powerpc/tm: Set MSR[TS] just prior to recheckpoint

Breno Leitao leitao at debian.org
Thu Nov 22 23:46:05 AEDT 2018


Hi Mikey,

On 11/21/18 8:42 PM, Michael Neuling wrote:
>> Do you mean in this part of code?
>>
>>   SYSCALL_DEFINE0(rt_sigreturn)
>>   {
>> 	....
>>         if (__copy_from_user(&set, &uc->uc_sigmask, sizeof(set)))
>>                 goto badframe;
>>
>> 	...
>>         if (MSR_TM_SUSPENDED(mfmsr()))
>>                 tm_reclaim_current(0);
> 
> I'm actually thinking after the reclaim, not before.
> 
> If I follow your original email properly, you have a problem because you end up
> in this scenario:
> 1) Current MSR is not TM suspended
> 2) regs->msr[TS] set
> 3) get_user() (which may fault)

In fact you need one more condition: the live TEXASR register does not have
the FS bit set. So, the current flow is:

 1) Current MSR is not TM suspended
 2) regs->msr[TS] set
 2a) TEXASR[FS] = 0
 3) get_user() (which may fault)

In this case, the page fault will call schedule(), which will call
__switch_to_tm().

__switch_to_tm() will call tm_reclaim_task(), which does:

	static inline void tm_reclaim_task(struct task_struct *tsk)
	{
		...
	        tm_reclaim_thread(thr, TM_CAUSE_RESCHED);
		...
	        tm_save_sprs(thr);
	}

So, the code above is executed at page fault with the scenario you described
(current MSR is not suspended, regs->msr[TS] set and current TEXASR[FS] = 0).

Here, tm_reclaim_task() will invoke tm_reclaim_thread(), which will return
early due to:

	        if (!MSR_TM_SUSPENDED(mfmsr()))
	                return;

tm_reclaim_task() then calls tm_save_sprs(thr), which does:

	_GLOBAL(tm_save_sprs)
     		mfspr   r0, SPRN_TEXASR		 <- TEXASR is 0 here
      		std     r0, THREAD_TM_TEXASR(r3) <- thr->texasr will be 0


In this case, we have a process that was descheduled properly but has
regs->msr[TS] set and thread->texasr[FS] = 0. (If current MSR[TS] had been set,
then the reclaim process would have set the live TEXASR[FS] for us, but that
did not happen, since MSR_TM_SUSPENDED(mfmsr()) was false.)
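
The deschedule path described above can be sketched as a small user-space
model. This is only an illustration under stated assumptions, not kernel
code: the types and the model_* names are hypothetical, MSR[TS] and
TEXASR[FS] are reduced to an enum and a bool, and the elided parts of
tm_reclaim_task() are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model, not kernel code. All names are hypothetical. */
enum ts_state { TS_NONE, TS_SUSPENDED };

struct model_cpu {
	enum ts_state msr_ts;	/* live MSR[TS] */
	bool texasr_fs;		/* live TEXASR[FS] */
};

struct model_thread {
	enum ts_state regs_msr_ts;	/* regs->msr[TS] */
	bool saved_texasr_fs;		/* thread->texasr[FS] */
};

/* Models tm_reclaim_thread(): bail out unless the live MSR is suspended. */
static void model_reclaim_thread(struct model_cpu *cpu)
{
	if (cpu->msr_ts != TS_SUSPENDED)
		return;
	/* Assumed effect of a full reclaim: FS gets set, TM state is left. */
	cpu->texasr_fs = true;
	cpu->msr_ts = TS_NONE;
}

/* Models tm_reclaim_task(): try to reclaim, then save the live SPRs. */
static void model_reclaim_task(struct model_cpu *cpu, struct model_thread *thr)
{
	assert(cpu != NULL && thr != NULL);
	model_reclaim_thread(cpu);
	thr->saved_texasr_fs = cpu->texasr_fs;	/* stands in for tm_save_sprs() */
}

/* Deschedules a task whose regs->msr[TS] is set; returns the saved FS bit. */
static bool saved_fs_after_deschedule(enum ts_state live_ts)
{
	struct model_cpu cpu = { .msr_ts = live_ts, .texasr_fs = false };
	struct model_thread thr = {
		.regs_msr_ts = TS_SUSPENDED,
		.saved_texasr_fs = false,
	};

	model_reclaim_task(&cpu, &thr);
	return thr.saved_texasr_fs;
}
```

In this model, saved_fs_after_deschedule(TS_NONE) leaves the saved FS bit
clear, which is the problematic state, while
saved_fs_after_deschedule(TS_SUSPENDED) goes through the full reclaim and
saves FS set.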

When this process is scheduled back in, it breaks. It goes through the
following call chain:

__switch_to_tm() -> tm_recheckpoint_new_task() -> tm_recheckpoint() which does:

	void tm_recheckpoint(struct thread_struct *thread)
	{
		...
	        tm_restore_sprs(thread); 	
        	__tm_recheckpoint(thread);
	}

In this case, __tm_recheckpoint() is called with current TEXASR[FS] = 0,
hitting that bug.
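
The switch-in side can be modelled the same way. Again this is a rough
sketch with hypothetical names (sketch_*), not the kernel implementation. It
assumes that a recheckpoint is only attempted when regs->msr[TS] is set, and
it represents "hitting that bug" as a return value instead of a real trap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model, not kernel code. All names are hypothetical. */
struct sketch_thread {
	bool regs_msr_ts;	/* regs->msr[TS] is set */
	bool saved_texasr_fs;	/* thread->texasr[FS] */
};

struct sketch_cpu {
	bool texasr_fs;		/* live TEXASR[FS] */
};

enum rechkpt_result { RECHKPT_SKIPPED, RECHKPT_OK, RECHKPT_BUG };

/* Models tm_recheckpoint_new_task() -> tm_recheckpoint(). */
static enum rechkpt_result sketch_switch_in(struct sketch_cpu *cpu,
					    const struct sketch_thread *thr)
{
	assert(cpu != NULL && thr != NULL);

	/* Stands in for tm_restore_sprs(): live TEXASR comes from the thread. */
	cpu->texasr_fs = thr->saved_texasr_fs;

	/* Assumption: no recheckpoint when the task has no TM state to restore. */
	if (!thr->regs_msr_ts)
		return RECHKPT_SKIPPED;

	/* Stands in for __tm_recheckpoint(): FS must be set at this point. */
	return cpu->texasr_fs ? RECHKPT_OK : RECHKPT_BUG;
}

/* Convenience wrapper: switch in a task with the given saved state. */
static enum rechkpt_result sketch_resume(bool regs_msr_ts, bool saved_fs)
{
	struct sketch_cpu cpu = { .texasr_fs = false };
	struct sketch_thread thr = {
		.regs_msr_ts = regs_msr_ts,
		.saved_texasr_fs = saved_fs,
	};

	return sketch_switch_in(&cpu, &thr);
}
```

Here sketch_resume(true, false) corresponds to the failing task (TS set in
regs, FS clear in the saved TEXASR), and sketch_resume(true, true) to a task
that was fully reclaimed before being switched out.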


> After the tm_reclaim there are cases in restore_tm_sigcontexts() where the
> above is also the case. Hence why I think we have a problem there too

Right, but in order to meet the criteria, you need to *fully* execute
tm_reclaim() (i.e. execute the TRECLAIM instruction), so thread->texasr[FS]
will be set.

So, at entry, you either have current MSR[TS] set and fully reclaim, thus
setting texasr[FS], or you have neither regs->msr[TS] *nor* current MSR[TS]
set (until later, where this patch fixes the problem).

Anyway, I might be missing something. So, the root of the problem seems to be
related to creating a case where current MSR[TS] is not set but regs->msr[TS]
is set.
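
To make that window concrete, here is a small sketch that scans a sequence
of steps and reports whether a faultable user access happens while
regs->msr[TS] is set but the live MSR[TS] is not. It is only a model: the
step names and both example orderings are hypothetical, and it assumes that
the recheckpoint is what sets the live MSR[TS].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model, not kernel code. All names are hypothetical. */
enum step {
	STEP_SET_REGS_TS,	/* regs->msr[TS] gets set */
	STEP_USER_ACCESS,	/* get_user()-style access that may fault */
	STEP_RECHECKPOINT	/* assumed to set the live MSR[TS] */
};

/*
 * Returns true if some user access happens while regs->msr[TS] is set
 * and the live MSR[TS] is still clear.
 */
static bool has_unsafe_window(const enum step *steps, size_t n)
{
	bool regs_ts = false;
	bool live_ts = false;

	assert(steps != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		switch (steps[i]) {
		case STEP_SET_REGS_TS:
			regs_ts = true;
			break;
		case STEP_USER_ACCESS:
			if (regs_ts && !live_ts)
				return true;
			break;
		case STEP_RECHECKPOINT:
			live_ts = true;
			break;
		}
	}
	return false;
}

/* TS set early: a user access still follows before the recheckpoint. */
static const enum step order_ts_set_early[] = {
	STEP_SET_REGS_TS, STEP_USER_ACCESS, STEP_RECHECKPOINT
};

/* TS set late: every user access is done before regs->msr[TS] is set. */
static const enum step order_ts_set_late[] = {
	STEP_USER_ACCESS, STEP_SET_REGS_TS, STEP_RECHECKPOINT
};
```

With these two orderings, has_unsafe_window(order_ts_set_early, 3) reports
the window and has_unsafe_window(order_ts_set_late, 3) does not, which is
the idea behind setting MSR[TS] just prior to the recheckpoint.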




