[PATCH v2 08/30] cxlflash: Fix to avoid CXL services during EEH
Brian King
brking at linux.vnet.ibm.com
Fri Sep 18 23:37:41 AEST 2015
On 09/16/2015 04:27 PM, Matthew R. Ochs wrote:
> --- a/drivers/scsi/cxlflash/main.c
> +++ b/drivers/scsi/cxlflash/main.c
> @@ -2311,6 +2311,7 @@ static int cxlflash_probe(struct pci_dev *pdev,
> cfg->lr_port = -1;
> mutex_init(&cfg->ctx_tbl_list_mutex);
> mutex_init(&cfg->ctx_recovery_mutex);
> + init_rwsem(&cfg->ioctl_rwsem);
> INIT_LIST_HEAD(&cfg->ctx_err_recovery);
> INIT_LIST_HEAD(&cfg->lluns);
>
> @@ -2365,6 +2366,19 @@ out_remove:
> }
>
> /**
> + * drain_ioctls() - wait until all currently executing ioctls have completed
> + * @cfg: Internal structure associated with the host.
> + *
> + * Obtain write access to read/write semaphore that wraps ioctl
> + * handling to 'drain' ioctls currently executing.
> + */
> +static void drain_ioctls(struct cxlflash_cfg *cfg)
> +{
> + down_write(&cfg->ioctl_rwsem);
> + up_write(&cfg->ioctl_rwsem);
> +}
> +
> +/**
> * cxlflash_pci_error_detected() - called when a PCI error is detected
> * @pdev: PCI device struct.
> * @state: PCI channel state.
> @@ -2383,16 +2397,14 @@ static pci_ers_result_t cxlflash_pci_error_detected(struct pci_dev *pdev,
> switch (state) {
> case pci_channel_io_frozen:
> cfg->state = STATE_LIMBO;
> -
> - /* Turn off legacy I/O */
> scsi_block_requests(cfg->host);
> + drain_ioctls(cfg);
So, what kicks any outstanding ioctls back? Let's assume you are in the middle of disk_attach
and you've sent the READ_CAP16 to the device. It appears as if what would happen here is we'd
sit here in cxlflash_pci_error_detected. Eventually, the READ_CAP16 would timeout. This would
wake the SCSI error handler, and end up calling your eh_device_reset handler, which would see that
we are in STATE_LIMBO, where it would then do a wait_event, waiting for us to get out of STATE_LIMBO,
and we would end up in a deadlock.
Rather than implementing a rw semaphore, would it be better to simply make the ioctls check the
state we are in and either wait to get out of EEH state or fail themselves?
> rc = cxlflash_mark_contexts_error(cfg);
> if (unlikely(rc))
> dev_err(dev, "%s: Failed to mark user contexts!(%d)\n",
> __func__, rc);
> term_mc(cfg, UNDO_START);
> stop_afu(cfg);
> -
> return PCI_ERS_RESULT_NEED_RESET;
> case pci_channel_io_perm_failure:
> cfg->state = STATE_FAILTERM;
> diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c
> index cf2a85d..8a18230 100644
> --- a/drivers/scsi/cxlflash/superpipe.c
> +++ b/drivers/scsi/cxlflash/superpipe.c
> @@ -1214,6 +1214,48 @@ static const struct file_operations null_fops = {
> };
>
> /**
> + * check_state() - checks and responds to the current adapter state
> + * @cfg: Internal structure associated with the host.
> + * @ioctl: Indicates if on an ioctl thread.
> + *
> + * This routine can block and should only be used on process context.
> + * When blocking on an ioctl thread, the ioctl read semaphore should be
> + * let up to allow for draining actively running ioctls. Also note that
> + * when waking up from waiting in reset, the state is unknown and must
> + * be checked again before proceeding.
> + *
> + * Return: 0 on success, -errno on failure
> + */
> +static int check_state(struct cxlflash_cfg *cfg, bool ioctl)
All your callers appear to set the second parameter to true, so why bother having it?
> +{
> + struct device *dev = &cfg->dev->dev;
> + int rc = 0;
> +
> +retry:
> + switch (cfg->state) {
> + case STATE_LIMBO:
> + dev_dbg(dev, "%s: Limbo state, going to wait...\n", __func__);
> + if (ioctl)
> + up_read(&cfg->ioctl_rwsem);
> + rc = wait_event_interruptible(cfg->limbo_waitq,
> + cfg->state != STATE_LIMBO);
> + if (ioctl)
> + down_read(&cfg->ioctl_rwsem);
> + if (unlikely(rc))
> + break;
> + goto retry;
> + case STATE_FAILTERM:
> + dev_dbg(dev, "%s: Failed/Terminating!\n", __func__);
> + rc = -ENODEV;
> + break;
> + default:
> + break;
> + }
> +
> + return rc;
> +}
> +
> +/**
> * cxlflash_disk_attach() - attach a LUN to a context
> * @sdev: SCSI device associated with LUN.
> * @attach: Attach ioctl data structure.
--
Brian King
Power Linux I/O
IBM Linux Technology Center
More information about the Linuxppc-dev
mailing list