[PATCH] powerpc/livepatch: Fix livepatch stack access
Kamalesh Babulal
kamalesh at linux.vnet.ibm.com
Thu Sep 21 21:53:50 AEST 2017
On Thursday 21 September 2017 04:30 PM, Balbir Singh wrote:
> On Thu, Sep 21, 2017 at 8:02 PM, Michael Ellerman <mpe at ellerman.id.au> wrote:
>> Kamalesh Babulal <kamalesh at linux.vnet.ibm.com> writes:
>>
>>> While running stress test with livepatch module loaded, kernel
>>> bug was triggered.
>>>
>>> cpu 0x5: Vector: 400 (Instruction Access) at [c0000000eb9d3b60]
>>> pc: c0000000eb9d3e30
>>> lr: c0000000eb9d3e30
>>> sp: c0000000eb9d3de0
>>> msr: 800000001280b033
>>> current = 0xc0000000dbd38700
>>> paca = 0xc00000000fe01400 softe: 0 irq_happened: 0x01
>>> pid = 8618, comm = make
>>> Linux version 4.13.0+ (root at ubuntu) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #1 SMP Wed Sep 13 03:49:27 EDT 2017
>>>
>>> 5:mon> t
>>> [c0000000eb9d3de0] c0000000eb9d3e30 (unreliable)
>>> [c0000000eb9d3e30] c000000000008ab4 hardware_interrupt_common+0x114/0x120
>>> --- Exception: 501 (Hardware Interrupt) at c000000000053040 livepatch_handler+0x4c/0x74
>>> [c0000000eb9d4120] 0000000057ac6e9d (unreliable)
>>> [d0000000089d9f78] 2e0965747962382e
>>> SP (965747962342e09) is in userspace
>>>
>>> When an interrupt is served in between the livepatch_handler execution,
>>> there are chances of the livepatch_stack/task task getting corrupted.
>>
>> Ouch. That's pretty broken by me.
>>
>
> I was worried more about preemption as I said in the review comment earlier,
> this is new. It looks like we restored the wrong r1 on returning from
> the interrupt
> context?
Yes, consider the example in the mail thread. Where the r0 is temporary
used to hold the stack pointer value before loading r1 with the
livepatch_sp (current->thread_info + 24). An interrupt occurs while the
livepatch stack is being setup to store or restore the LR/TOC value of
the calling function.
do_IRQ -> call_do_irq's first instruction mflr r0, over-writes the stack
value with LR value. Consecutive instruction referring to r1 in
call_do_irq are actually referring to the livepatch_sp instead of task
stack. On the return to livepatch_handler, stack is restored from r0
value (which has been over-written with LR value in call_do_irq). Any
access therefore made to the stack is to a wrong address.
> It would be nice to see any pt_regs changes due to the interrupt.
> Did the interrupt handling code called something that needed live-patching?
>
AFAIK, the livepatch_sp value for softirq_ctx/hardirq_ctx are
initialized, as it's part of thread_info. Am I missing something here.
>
>>> Fix the corruption by using r11 register for livepatch stack manipulation,
>>> instead of shuffling task stack and livepatch stack into r1 register.
>>> Using r11 register also avoids disabling/enabling irq's while setting
>>> up the livepatch stack.
>>
The r11 is restored before calling the livepatch_handler by ftrace_caller:
/* Load CTR with the possibly modified NIP */
mtctr r15
/* Restore gprs */
REST_GPR(0,r1)
REST_10GPRS(2,r1)
REST_10GPRS(12,r1)
REST_10GPRS(22,r1)
[...]
#ifdef CONFIG_LIVEPATCH
/*
* Based on the cmpd or cmpdi above, if the NIP was altered and
we're
* not on a kprobe/jprobe, then handle livepatch.
*/
bne- livepatch_handler
#endif
--
cheers,
Kamalesh.
More information about the Linuxppc-dev
mailing list