[PATCH] Powerpc/Watchpoint: Restore nvgprs while returning from exception
Ravi Bangoria
ravi.bangoria at linux.ibm.com
Fri Jun 7 16:26:39 AEST 2019
On 6/7/19 11:20 AM, Michael Ellerman wrote:
> Ravi Bangoria <ravi.bangoria at linux.ibm.com> writes:
>
>> Powerpc hw triggers watchpoint before executing the instruction.
>> To make trigger-after-execute behavior, kernel emulates the
>> instruction. If the instruction is 'load something into non-
>> volatile register', exception handler should restore emulated
>> register state while returning back, otherwise there will be
>> register state corruption. Ex, Adding a watchpoint on a list
>> can corrput the list:
>>
>> # cat /proc/kallsyms | grep kthread_create_list
>> c00000000121c8b8 d kthread_create_list
>>
>> Add watchpoint on kthread_create_list->next:
>>
>> # perf record -e mem:0xc00000000121c8c0
>>
>> Run some workload such that new kthread gets invoked. Ex, I
>> just logged out from console:
>>
>> list_add corruption. next->prev should be prev (c000000001214e00), \
>> but was c00000000121c8b8. (next=c00000000121c8b8).
>> WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
>> CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
>> ...
>> NIP __list_add_valid+0xb4/0xc0
>> LR __list_add_valid+0xb0/0xc0
>> Call Trace:
>> __list_add_valid+0xb0/0xc0 (unreliable)
>> __kthread_create_on_node+0xe0/0x260
>> kthread_create_on_node+0x34/0x50
>> create_worker+0xe8/0x260
>> worker_thread+0x444/0x560
>> kthread+0x160/0x1a0
>> ret_from_kernel_thread+0x5c/0x70
>
> This all depends on what code the compiler generates for the list
> access.
True. list corruption is just an example. But any load instruction that uses
non-volatile register and hits a watchpoint, will result in register state
corruption.
> Can you include a disassembly of the relevant code in your
> kernel so we have an example of the bad case.
Register state from WARN_ON():
GPR00: c00000000059a3a0 c000007ff23afb50 c000000001344e00 0000000000000075
GPR04: 0000000000000000 0000000000000000 0000001852af8bc1 0000000000000000
GPR08: 0000000000000001 0000000000000007 0000000000000006 00000000000004aa
GPR12: 0000000000000000 c000007ffffeb080 c000000000137038 c000005ff62aaa00
GPR16: 0000000000000000 0000000000000000 c000007fffbe7600 c000007fffbe7370
GPR20: c000007fffbe7320 c000007fffbe7300 c000000001373a00 0000000000000000
GPR24: fffffffffffffef7 c00000000012e320 c000007ff23afcb0 c000000000cb8628
GPR28: c00000000121c8b8 c000000001214e00 c000007fef5b17e8 c000007fef5b17c0
Snippet from __kthread_create_on_node:
c000000000136be8: ed ff a2 3f addis r29,r2,-19
c000000000136bec: c0 7a bd eb ld r29,31424(r29)
if (!__list_add_valid(new, prev, next))
c000000000136bf0: 78 f3 c3 7f mr r3,r30
c000000000136bf4: 78 e3 85 7f mr r5,r28
c000000000136bf8: 78 eb a4 7f mr r4,r29
c000000000136bfc: fd 36 46 48 bl c00000000059a2f8 <__list_add_valid+0x8>
Watchpoint hit at 0xc000000000136bec.
addis r29,r2,-19
=> r29 = 0xc000000001344e00 + (-19 << 16)
=> r29 = 0xc000000001214e00
ld r29,31424(r29)
=> r29 = *(0xc000000001214e00 + 31424)
=> r29 = *(0xc00000000121c8c0)
0xc00000000121c8c0 is where we placed a watchpoint and thus this instruction was
emulated by emulate_step. But because handle_dabr_fault did not restore emulated
register state, r29 still contains stale value in above register state.
More information about the Linuxppc-dev
mailing list