[PATCH v3 40/41] cxlflash: Remove commands from pending list on timeout
Matthew R. Ochs
mrochs at linux.vnet.ibm.com
Thu Mar 29 01:50:32 AEDT 2018
On Mon, Mar 26, 2018 at 11:35:34AM -0500, Uma Krishnan wrote:
> The following Oops can occur if an internal command sent to the AFU does
> not complete within the timeout:
>
> [c000000ff101b810] c008000016020d94 term_mc+0xfc/0x1b0 [cxlflash]
> [c000000ff101b8a0] c008000016020fb0 term_afu+0x168/0x280 [cxlflash]
> [c000000ff101b930] c0080000160232ec cxlflash_pci_error_detected+0x184/0x230 [cxlflash]
> [c000000ff101b9e0] c00800000d95d468 cxl_vphb_error_detected+0x90/0x150 [cxl]
> [c000000ff101ba20] c00800000d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl]
> [c000000ff101bac0] c00000000003eaf8 eeh_report_error+0xd8/0x1b0
> [c000000ff101bb20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
> [c000000ff101bbb0] c00000000003f438 eeh_handle_normal_event+0x198/0x580
> [c000000ff101bc60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
> [c000000ff101bd10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
> [c000000ff101bdc0] c00000000013da48 kthread+0x1a8/0x1b0
> [c000000ff101be30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4
>
> When an internal command times out, the command buffer is freed while it
> is still in the pending commands list of the context. This corrupts the
> list and when the context is cleaned up, a crash is encountered.
>
> To resolve this issue, when an AFU command or TMF command times out, the
> command should be deleted from the hardware queue pending command list
> before freeing the buffer.
>
> Signed-off-by: Uma Krishnan <ukrishn at linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs at linux.vnet.ibm.com>
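
For readers following along, the ordering described in the commit message (unlink the command from the pending list, then free it) can be sketched in userspace C roughly as below. This is only an illustration under stated assumptions, not the driver code: the list helpers are a stand-in for the kernel's list_head API, and demo_cmd, demo_hwq, demo_send and demo_timeout are hypothetical names invented for this sketch. A real driver would also need to hold whatever lock protects the pending list while unlinking.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal circular doubly linked list, a stand-in for the kernel's list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
}

static size_t list_len(const struct list_node *head)
{
	size_t n = 0;

	for (const struct list_node *p = head->next; p != head; p = p->next)
		n++;
	return n;
}

/* Hypothetical command and hardware queue types, for illustration only. */
struct demo_cmd {
	struct list_node queue;		/* link in the pending list */
	int id;
};

struct demo_hwq {
	struct list_node pending;	/* commands sent but not yet completed */
};

/* Allocate a command and track it as pending. */
static struct demo_cmd *demo_send(struct demo_hwq *hwq, int id)
{
	struct demo_cmd *cmd = malloc(sizeof(*cmd));

	if (!cmd)
		return NULL;
	cmd->id = id;
	list_add_tail(&cmd->queue, &hwq->pending);
	return cmd;
}

/*
 * Timeout path: unlink first, then free. Freeing without the list_del()
 * would leave the neighbours pointing at freed memory, so a later walk
 * of the pending list (e.g. during cleanup) would touch a dangling node.
 */
static void demo_timeout(struct demo_cmd *cmd)
{
	assert(cmd != NULL);
	list_del(&cmd->queue);
	free(cmd);
}
```

After demo_timeout() returns, the pending list holds only live commands, so a later teardown pass can walk it safely.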