[PATCH] powerpc/mm/hash: Hand user access of kernel address gracefully

Thu Dec 13 13:15:23 AEDT 2018

Breno Leitao <leitao at debian.org> writes:

> hi Aneesh,
>
> On 11/26/18 12:35 PM, Aneesh Kumar K.V wrote:
>> With commit 2865d08dd9ea ("powerpc/mm: Move the DSISR_PROTFAULT sanity check")
>> we moved the protection fault access check before vma lookup. That means we
>> hit that WARN_ON when user space access a kernel address.  Before the commit
>> this was handled by find_vma() not finding vma for the kernel address and
>> considering that access as bad area access.
>> 
>> Avoid the confusing WARN_ON and convert that to a ratelimited printk.
>> With the patch we now get
>> 
>> for load:
>> [  187.700294] a.out[5997]: User access of kernel address (c00000000000dea0) - exploit attempt? (uid: 1000)
>> [  187.700344] a.out[5997]: segfault (11) at c00000000000dea0 nip 1317c0798 lr 7fff80d6441c code 1 in a.out[1317c0000+10000]
>> [  187.700429] a.out[5997]: code: 60000000 60420000 3c4c0002 38427790 4bffff20 3c4c0002 38427784 fbe1fff8
>> [  187.700435] a.out[5997]: code: f821ffc1 7c3f0b78 60000000 e9228030 <89290000> 993f002f 60000000 383f0040
>> 
>> for exec:
>> [  225.100903] a.out[6067]: User access of kernel address (c00000000000dea0) - exploit attempt? (uid: 1000)
>> [  225.100938] a.out[6067]: segfault (11) at c00000000000dea0 nip c00000000000dea0 lr 129d507b0 code 1
>> [  225.100943] a.out[6067]: Bad NIP, not dumping instructions.
>> 
>> Fixes: 2865d08dd9ea ("powerpc/mm: Move the DSISR_PROTFAULT sanity check")
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
>
> Tested-by: Breno Leitao <leitao at debian.org>

Thanks.

>> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
>> index 1697e903bbf2..46f280068c45 100644
>> --- a/arch/powerpc/mm/fault.c
>> +++ b/arch/powerpc/mm/fault.c
>> @@ -342,8 +342,21 @@ static inline void cmo_account_page_fault(void) { }
>>  #endif /* CONFIG_PPC_SMLPAR */
>>  
>>  #ifdef CONFIG_PPC_STD_MMU
>> -static void sanity_check_fault(bool is_write, unsigned long error_code)
>> +static void sanity_check_fault(bool is_write, bool is_user,
>> +			       unsigned long error_code, unsigned long address)
>>  {
>> +	/*
>> +	 * userspace trying to access kernel address, we get PROTFAULT for that.
>> +	 */
>> +	if (is_user && address >= TASK_SIZE) {
>> +		printk_ratelimited(KERN_CRIT "%s[%d]: "
>> +				   "User access of kernel address (%lx) - "
>> +				   "exploit attempt? (uid: %d)\n",
>> +				   current->comm, current->pid, address,
>> +				   from_kuid(&init_user_ns, current_uid()));
>> +		return;
>
> Silly question: Is it OK to printk() and just return here? __do_page_fault
> will continue to execute independently of this return, right?

Yeah it is OK to just return.

I agree it's a bit of a strange way for the code to be structured, ie.
we detect a bad condition and print about it and then just return and
let it continue anyway.

I guess it's that way because it was added as an additional check, ie.
the code already handled those cases further down, but this was a check
in case anything weird happened.

If you look at the start of __do_page_fault() we have three separate
checks:

	if (unlikely(page_fault_is_bad(error_code))) {
		if (is_user) {
			_exception(SIGBUS, regs, BUS_OBJERR, address);
			return 0;
		}
		return SIGBUS;
	}

	/* Additional sanity check(s) */
	sanity_check_fault(is_write, is_user, error_code, address);

	/*
	 * The kernel should never take an execute fault nor should it
	 * take a page fault to a kernel address.
	 */
	if (unlikely(!is_user && bad_kernel_fault(is_exec, error_code, address)))
		return SIGSEGV;

It seems like maybe we could simplify that somewhat.

We need to be careful though that we return the right signal (SEGV or
BUS), and also that user faults get counted (see PERF_COUNT_SW_PAGE_FAULTS).

So it's not straight forward as usual :)

cheers