[PATCH 4/5] PCI/ERR: Update device error_state already after reset

Sathyanarayanan Kuppuswamy sathyanarayanan.kuppuswamy at linux.intel.com
Thu Aug 14 09:43:39 AEST 2025


On 8/12/25 10:11 PM, Lukas Wunner wrote:
> After a Fatal Error has been reported by a device and has been recovered
> through a Secondary Bus Reset, AER updates the device's error_state to
> pci_channel_io_normal before invoking its driver's ->resume() callback.
>
> By contrast, EEH updates the error_state earlier, namely after resetting
> the device and before invoking its driver's ->slot_reset() callback.
> Commit c58dc575f3c8 ("powerpc/pseries: Set error_state to
> pci_channel_io_normal in eeh_report_reset()") explains in great detail
> that the earlier invocation is necessitated by various drivers checking
> accessibility of the device with pci_channel_offline() and avoiding
> accesses if it returns true.  It returns true for any error_state other
> than pci_channel_io_normal.
>
> The device should be accessible already after reset, hence the reasoning
> is that it's safe to update the error_state immediately afterwards.
>
> This deviation between AER and EEH seems problematic because drivers
> behave differently depending on which error recovery mechanism the
> platform uses.  Three drivers have gone so far as to update the
> error_state themselves, presumably to work around AER's behavior.
>
> For consistency, amend AER to update the error_state at the same recovery
> steps as EEH.  Drop the now unnecessary workaround from the three drivers.
>
> Keep updating the error_state before ->resume() in case ->error_detected()
> or ->mmio_enabled() return PCI_ERS_RESULT_RECOVERED, which causes
> ->slot_reset() to be skipped.  There are drivers doing this even for Fatal
> Errors, e.g. mhi_pci_error_detected().
>
> Signed-off-by: Lukas Wunner <lukas at wunner.de>
> ---

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy at linux.intel.com>
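
For anyone following the ordering argument, here is a minimal userspace
sketch of it. It is not kernel code. All model_* names are hypothetical
and exist only for illustration. It assumes that pci_channel_offline()
amounts to "error_state != pci_channel_io_normal", as the commit message
describes, and that a device in the permanent failure state cannot be
moved back to normal, which is what the new pci_dev_set_io_state() check
in report_slot_reset() appears to rely on.

```c
#include <assert.h>	/* only needed if you add checks of your own */
#include <stdbool.h>

/* Userspace model only; the three states mirror pci_channel_state_t. */
enum model_channel_state {
	MODEL_IO_NORMAL = 1,
	MODEL_IO_FROZEN = 2,
	MODEL_IO_PERM_FAILURE = 3,
};

struct model_dev {
	enum model_channel_state error_state;
};

/* Models pci_channel_offline(): true for anything but "normal". */
static bool model_channel_offline(const struct model_dev *dev)
{
	return dev->error_state != MODEL_IO_NORMAL;
}

/*
 * Simplified stand-in for pci_dev_set_io_state(dev, pci_channel_io_normal).
 * Assumption: a permanently failed device never returns to normal.
 */
static bool model_set_io_normal(struct model_dev *dev)
{
	if (dev->error_state == MODEL_IO_PERM_FAILURE)
		return false;
	dev->error_state = MODEL_IO_NORMAL;
	return true;
}

/*
 * Hypothetical driver ->slot_reset() callback: like the drivers named in
 * the commit message, it avoids hardware access while the channel is
 * reported offline. Returns true if it was able to touch the device.
 */
static bool model_slot_reset(const struct model_dev *dev)
{
	return !model_channel_offline(dev);
}

/*
 * Models one recovery pass after a reset.
 * update_before_slot_reset == false: old AER ordering (state is updated
 *                                    only before ->resume()).
 * update_before_slot_reset == true:  ordering proposed by this patch.
 * Returns true if the modelled ->slot_reset() could access the device.
 */
static bool model_recover(struct model_dev *dev, bool update_before_slot_reset)
{
	bool reinit_ok;

	if (update_before_slot_reset && !model_set_io_normal(dev))
		return false;	/* permanently failed: callback is skipped */

	reinit_ok = model_slot_reset(dev);

	/* Both orderings still update the state before ->resume(). */
	model_set_io_normal(dev);
	return reinit_ok;
}
```

With a device that starts out frozen, model_recover(dev, false) reports
that the callback could not touch the hardware, while
model_recover(dev, true) reports that it could. That is the behavioral
difference the three driver workarounds were compensating for.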

>   drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 1 -
>   drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c    | 2 --
>   drivers/pci/pcie/err.c                              | 3 ++-
>   drivers/scsi/qla2xxx/qla_os.c                       | 5 -----
>   4 files changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
> index d7cdea8f604d..91e7b38143ea 100644
> --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
> +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
> @@ -4215,7 +4215,6 @@ static pci_ers_result_t qlcnic_83xx_io_slot_reset(struct pci_dev *pdev)
>   	struct qlcnic_adapter *adapter = pci_get_drvdata(pdev);
>   	int err = 0;
>   
> -	pdev->error_state = pci_channel_io_normal;
>   	err = pci_enable_device(pdev);
>   	if (err)
>   		goto disconnect;
> diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
> index 53cdd36c4123..e051d8c7a28d 100644
> --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
> +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
> @@ -3766,8 +3766,6 @@ static int qlcnic_attach_func(struct pci_dev *pdev)
>   	struct qlcnic_adapter *adapter = pci_get_drvdata(pdev);
>   	struct net_device *netdev = adapter->netdev;
>   
> -	pdev->error_state = pci_channel_io_normal;
> -
>   	err = pci_enable_device(pdev);
>   	if (err)
>   		return err;
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 930bb60fb761..bebe4bc111d7 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -153,7 +153,8 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>   
>   	device_lock(&dev->dev);
>   	pdrv = dev->driver;
> -	if (!pdrv || !pdrv->err_handler || !pdrv->err_handler->slot_reset)
> +	if (!pci_dev_set_io_state(dev, pci_channel_io_normal) ||
> +	    !pdrv || !pdrv->err_handler || !pdrv->err_handler->slot_reset)
>   		goto out;
>   
>   	err_handler = pdrv->err_handler;
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index d4b484c0fd9d..4460421834cb 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -7883,11 +7883,6 @@ qla2xxx_pci_slot_reset(struct pci_dev *pdev)
>   	       "Slot Reset.\n");
>   
>   	ha->pci_error_state = QLA_PCI_SLOT_RESET;
> -	/* Workaround: qla2xxx driver which access hardware earlier
> -	 * needs error state to be pci_channel_io_online.
> -	 * Otherwise mailbox command timesout.
> -	 */
> -	pdev->error_state = pci_channel_io_normal;
>   
>   	pci_restore_state(pdev);
>   

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


