[PATCH] powerpc: Save AMR/IAMR when switching tasks

Samuel Holland samuel at sholland.org
Wed Oct 26 15:27:35 AEDT 2022


On 9/21/22 00:17, Christophe Leroy wrote:
> Le 21/09/2022 à 05:33, Samuel Holland a écrit :
>> On 9/19/22 07:37, Michael Ellerman wrote:
>>> Christophe Leroy <christophe.leroy at csgroup.eu> writes:
>>>> Le 16/09/2022 à 07:05, Samuel Holland a écrit :
>>>>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>>>>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>>>>> with userspace access enabled until after the next IRQ or privilege
>>>>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>>>>> switching back to the original task, the userspace access would fault:
>>>>
>>>> This is not supposed to happen. You never switch away from a task
>>>> magically. Task switch will always happen in an interrupt, that means
>>>> copy_{from,to}_user() get interrupted.
>>>
>>> Unfortunately this isn't true when CONFIG_PREEMPT=y.
>>>
>>> We can switch away without an interrupt via:
>>>    __copy_tofrom_user()
>>>      -> __copy_tofrom_user_power7()
>>>         -> exit_vmx_usercopy()
>>>            -> preempt_enable()
>>>               -> __preempt_schedule()
>>>                  -> preempt_schedule()
>>>                     -> preempt_schedule_common()
>>>                        -> __schedule()
>>>
>>> I do some boot tests with CONFIG_PREEMPT=y, but I realise now those are
>>> all on Power8, which is a bit of an oversight on my part.
>>>
>>> And clearly no one else tests it, until now :)
>>>
>>> I think the root of our problem is that our KUAP lock/unlock is at too
>>> high a level, ie. we do it in C around the low-level copy to/from.
>>>
>>> eg:
>>>
>>> static inline unsigned long
>>> raw_copy_to_user(void __user *to, const void *from, unsigned long n)
>>> {
>>> 	unsigned long ret;
>>>
>>> 	allow_write_to_user(to, n);
>>> 	ret = __copy_tofrom_user(to, (__force const void __user *)from, n);
>>> 	prevent_write_to_user(to, n);
>>> 	return ret;
>>> }
>>>
>>> There's a reason we did that, which is that we have various different
>>> KUAP methods on different platforms, not a simple instruction like other
>>> arches.
>>>
>>> But that means we have that exit_vmx_usercopy() being called deep in the
>>> guts of __copy_tofrom_user(), with KUAP disabled, and then we call into
>>> the preempt machinery and eventually schedule.
>>>
>>> I don't see an easy way to fix that "properly", it would be a big change
>>> to all platforms to push the KUAP save/restore down into the low level
>>> asm code.
>>>
>>> But I think the patch below does fix it, although it abuses things a
>>> little. Namely it only works because the 64s KUAP code can handle a
>>> double call to prevent, and doesn't need the addresses or size for the
>>> allow.
>>>
>>> Still I think it might be our best option for an easy fix.
>>>
>>> Samuel, can you try this on your system and check it works for you?
>>
>> It looks like your patch works. Thanks for the correct fix!
> 
> Instead of the patch from Michael, could you try by replacing 
> preempt_enable() by preempt_enable_no_resched() in exit_vmx_usercopy() ?

I finally got a chance to test this, and the simpler fix of using
preempt_enable_no_resched() works as well.

>> I replaced my patch with the one below, and enabled
>> CONFIG_PPC_KUAP_DEBUG=y, and I was able to do several kernel builds
>> without any crashes or splats in dmesg.
> 
> Did you try CONFIG_PPC_KUAP_DEBUG without the patch ? Did it detect any 
> problem ?

I believe I did at one point, and it did not detect anything.

Regards,
Samuel



More information about the Linuxppc-dev mailing list