BUG: KASAN: stack-out-of-bounds

Andrey Ryabinin aryabinin at virtuozzo.com
Thu Feb 28 20:22:53 AEDT 2019



On 2/27/19 4:11 PM, Christophe Leroy wrote:
> 
> 
> On 27/02/2019 at 10:19, Andrey Ryabinin wrote:
>>
>>
>> On 2/27/19 11:25 AM, Christophe Leroy wrote:
>>> With version v8 of the series implementing KASAN on 32-bit powerpc (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=94309), I'm now able to activate KASAN on a mac99 in QEMU.
>>>
>>> Then I get the following reports at startup. Which of the two reports I get seems to depend on the options used to build the kernel, but for a given kernel I always get the same report.
>>>
>>> Is that a real bug, and if so, how could I track it down? Or is something wrong in my implementation of KASAN?
>>>
>>> I checked that after kasan_init(), the entire shadow memory contains only zeroes.
>>>
>>> I also tried building with the strong STACK_PROTECTOR, but it made no difference and the stack protector detected nothing.
>>>
>>> ==================================================================
>>> BUG: KASAN: stack-out-of-bounds in memchr+0x24/0x74
>>> Read of size 1 at addr c0ecdd40 by task swapper/0
>>>
>>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1133
>>> Call Trace:
>>> [c0e9dca0] [c01c42a0] print_address_description+0x64/0x2bc (unreliable)
>>> [c0e9dcd0] [c01c4684] kasan_report+0xfc/0x180
>>> [c0e9dd10] [c089579c] memchr+0x24/0x74
>>> [c0e9dd30] [c00a9e38] msg_print_text+0x124/0x574
>>> [c0e9dde0] [c00ab710] console_unlock+0x114/0x4f8
>>> [c0e9de40] [c00adc60] vprintk_emit+0x188/0x1c4
>>> --- interrupt: c0e9df00 at 0x400f330
>>>      LR = init_stack+0x1f00/0x2000
>>> [c0e9de80] [c00ae3c4] printk+0xa8/0xcc (unreliable)
>>> [c0e9df20] [c0c28e44] early_irq_init+0x38/0x108
>>> [c0e9df50] [c0c16434] start_kernel+0x310/0x488
>>> [c0e9dff0] [00003484] 0x3484
>>>
>>> The buggy address belongs to the variable:
>>>   __log_buf+0xec0/0x4020
>>> The buggy address belongs to the page:
>>> page:c6eac9a0 count:1 mapcount:0 mapping:00000000 index:0x0
>>> flags: 0x1000(reserved)
>>> raw: 00001000 c6eac9a4 c6eac9a4 00000000 00000000 00000000 ffffffff 00000001
>>> page dumped because: kasan: bad access detected
>>>
>>> Memory state around the buggy address:
>>>   c0ecdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>   c0ecdc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>>> c0ecdd00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>>>                                     ^
>>>   c0ecdd80: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>>>   c0ecde00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> ==================================================================
>>>
>>
>> This one doesn't look good. Notice that it says stack-out-of-bounds, but at the same time there is
>>     "The buggy address belongs to the variable:  __log_buf+0xec0/0x4020"
>>   which is printed by the following code:
>>     if (kernel_or_module_addr(addr) && !init_task_stack_addr(addr)) {
>>         pr_err("The buggy address belongs to the variable:\n");
>>         pr_err(" %pS\n", addr);
>>     }
>>
>> So an address unrelated to the stack got stack-related poisoning. This could be a stack overflow; did you increase THREAD_SHIFT?
>> KASAN with stack instrumentation significantly increases stack usage.
>>
> 
> I get the above with THREAD_SHIFT set to 13 (default value).
> If I increase it to 14, I get the following instead. That means that in that case the problem arises a lot earlier in the boot process (but still after the final kasan shadow setup).
> 

We usually use 15 (with 4k pages), but I think 14 should be enough for a clean boot.

> ==================================================================
> BUG: KASAN: stack-out-of-bounds in pmac_nvram_init+0x1f8/0x5d0
> Read of size 1 at addr f6f37de0 by task swapper/0
> 
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc7+ #1143
> Call Trace:
> [c0e9fd60] [c01c43c0] print_address_description+0x164/0x2bc (unreliable)
> [c0e9fd90] [c01c46a4] kasan_report+0xfc/0x180
> [c0e9fdd0] [c0c226d4] pmac_nvram_init+0x1f8/0x5d0
> [c0e9fef0] [c0c1f73c] pmac_setup_arch+0x298/0x314
> [c0e9ff20] [c0c1ac40] setup_arch+0x250/0x268
> [c0e9ff50] [c0c151dc] start_kernel+0xb8/0x488
> [c0e9fff0] [00003484] 0x3484
> 
> 
> Memory state around the buggy address:
>  f6f37c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  f6f37d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>f6f37d80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
>                                                ^
>  f6f37e00: 00 00 01 f4 f2 f2 f2 f2 00 00 00 00 f2 f2 f2 f2
>  f6f37e80: 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00 00 00
> ==================================================================

Powerpc's show_stack() prints stack addresses, so we know that the stack is somewhere near address 0xc0e9f...
f6f37de0 is definitely not a stack address, and it's too far away to be explained by a stack overflow.
So it looks like the shadow for the stack, kasan_mem_to_shadow(0xc0e9f...), and the shadow for the address in the report,
kasan_mem_to_shadow(0xf6f37de0), point to the same physical page.

