[PATCH] powerpc/tm: Fix userspace r13 corruption
Breno Leitao
leitao at debian.org
Tue Sep 25 00:32:48 AEST 2018
Hi Mikey,
On 09/24/2018 04:27 AM, Michael Neuling wrote:
> When we treclaim we store the userspace checkpointed r13 to a scratch
> SPR and then later save the scratch SPR to the user thread struct.
>
> Unfortunately, this doesn't work as accessing the user thread struct
> can take an SLB fault and the SLB fault handler will write the same
> scratch SPRG that now contains the userspace r13.
>
> To fix this, we store r13 to the kernel stack (which can't fault)
> before we access the user thread struct.
>
> Found by running P8 guest + powervm + disable_1tb_segments + TM. Seen
> as a random userspace segfault with r13 looking like a kernel address.
>
> Signed-off-by: Michael Neuling <mikey at neuling.org>
Reviewed-by: Breno Leitao <leitao at debian.org>
I am wondering if the same problem could not happen with r1 as well, since r1
is kept in the paca->tm_scratch after MSR[RI] is enabled, in a code similar
to this:
std r1, PACATMSCRATCH(r13)
..
li r11, MSR_RI
mtmsrd r11, 1
....
ld r3, PACATMSCRATCH(r13) /* user r1 */
std r3, GPR1(r7)
There might be a corruption if the exception that is raised just after
MSR[RI] is enabled somehow exits through 'fast_exception_return' (being
called by most of the exception handlers as I understand) which will
overwrite paca->tm_scratch with the upcoming MSR (aka SRR1) just before the
SRR1, as:
ld r3,_MSR(r1)
...
std r3, PACATMSCRATCH(r13) /* Stash returned-to MSR */
It seems that slb_miss_common and instruction_access_slb does not go through
this path, but handle_page_fault calls
ret_from_except_lite->fast_exception_return, which might cause this
'possible' issue.
Anyway, I would like to investigate if this problem might happen or not for
real, but I do not fully understand what are the exceptions that might be
raised when just MSR[RI] is enabled.
More information about the Linuxppc-dev
mailing list