[PATCH 5/6] cxlflash: Resolve oops in wait_port_offline

Uma Krishnan ukrishn at linux.vnet.ibm.com
Fri Dec 18 09:30:23 AEDT 2015


On 12/10/2015 4:54 PM, Uma Krishnan wrote:
> From: Manoj Kumar <manoj at linux.vnet.ibm.com>
>
> If an async error interrupt is generated, and the error requires the FC
> link to be reset, the reset cannot be performed in interrupt context. So
> a work element is scheduled to complete the link reset in process
> context. If either an EEH event or an escalation occurs between the time
> the interrupt is generated and the time the scheduled work starts, the
> MMIO space may no longer be available. This causes an oops in the
> worker thread.
>
> [  606.806583] NIP kthread_data+0x28/0x40
> [  606.806633] LR wq_worker_sleeping+0x30/0x100
> [  606.806694] Call Trace:
> [  606.806721] 0x50 (unreliable)
> [  606.806796] wq_worker_sleeping+0x30/0x100
> [  606.806884] __schedule+0x69c/0x8a0
> [  606.806959] schedule+0x44/0xc0
> [  606.807034] do_exit+0x770/0xb90
> [  606.807109] die+0x300/0x460
> [  606.807185] bad_page_fault+0xd8/0x150
> [  606.807259] handle_page_fault+0x2c/0x30
> [  606.807338] wait_port_offline.constprop.12+0x60/0x130 [cxlflash]
>
> To prevent the problem space area from being unmapped while work is
> pending, a mapcount (using the kref mechanism) is held.  The mapcount
> is released only when the work is completed.  Dropping the last
> reference is what triggers the unmapping service.
>
> Signed-off-by: Manoj N. Kumar <manoj at linux.vnet.ibm.com>
> ---

Reviewed-by: Uma Krishnan <ukrishn at linux.vnet.ibm.com>
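
For readers following the thread, the mapcount scheme described in the
quoted commit message can be sketched in user space. This is only an
illustration under stated assumptions, not the patch itself: the names
afu_sketch, afu_map, afu_get, afu_put, afu_unmap, schedule_link_reset and
link_reset_worker are hypothetical, and a C11 atomic counter stands in for
the kernel's struct kref.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for adapter state; not the cxlflash structures. */
struct afu_sketch {
    atomic_int mapcount;  /* plays the role of a struct kref */
    bool mmio_mapped;     /* true while the MMIO area is usable */
};

/* Map the area and take the initial reference (analogous to kref_init). */
static void afu_map(struct afu_sketch *afu)
{
    afu->mmio_mapped = true;
    atomic_init(&afu->mapcount, 1);
}

/* Take an extra reference (analogous to kref_get). */
static void afu_get(struct afu_sketch *afu)
{
    atomic_fetch_add(&afu->mapcount, 1);
}

/* Release callback: runs only when the last reference is dropped. */
static void afu_unmap(struct afu_sketch *afu)
{
    afu->mmio_mapped = false;
}

/*
 * Drop a reference (analogous to kref_put). Returns true if this call
 * dropped the last reference and therefore performed the unmap.
 */
static bool afu_put(struct afu_sketch *afu)
{
    assert(atomic_load(&afu->mapcount) > 0);
    if (atomic_fetch_sub(&afu->mapcount, 1) == 1) {
        afu_unmap(afu);
        return true;
    }
    return false;
}

/* Interrupt side: pin the mapping before deferring the link reset. */
static void schedule_link_reset(struct afu_sketch *afu)
{
    afu_get(afu);
}

/*
 * Worker side: the mapping is still valid here because of the reference
 * taken above. Returns whether MMIO was usable when the work ran, then
 * drops the reference held on behalf of the pending work.
 */
static bool link_reset_worker(struct afu_sketch *afu)
{
    bool could_touch_mmio = afu->mmio_mapped;
    afu_put(afu);
    return could_touch_mmio;
}
```

In this sketch, if a teardown path (for example after an EEH event) calls
afu_put() while work is still pending, the count only falls from 2 to 1 and
the area stays mapped. The unmap happens when the worker drops its own
reference, so the worker never runs against an unmapped area.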


