[PATCH v3] PCI hotplug: rpaphp: Error out on busy status from get-sensor-state
Mahesh J Salgaonkar
mahesh at linux.ibm.com
Sat Jan 29 02:06:59 AEDT 2022
On 2021-12-09 09:02:51 Thu, Nathan Lynch wrote:
> Mahesh Salgaonkar <mahesh at linux.ibm.com> writes:
> > To avoid this issue, fix the pci hotplug driver (rpaphp) to return an error
> > if the slot presence state can not be detected immediately. Current
> > implementation uses rtas_get_sensor() API which blocks the slot check state
> > until rtas call returns success. Change rpaphp_get_sensor_state() to invoke
> > rtas_call(get-sensor-state) directly and take actions based on rtas return
> > status. This patch now errors out immediately on busy return status from
> > rtas_call.
> >
> > Please note that, only on certain PHB failures, the slot presence check
> > returns BUSY condition. In normal cases it returns immediately with a
> > correct presence state value. Hence this change has no impact on normal pci
> > dlpar operations.
>
> I was wondering about this. This seems to be saying -2/990x cannot
> happen in other cases. I couldn't find this specified in the
> architecture. It seems a bit risky to me to *always* error out on
> -2/990x - won't we have intermittent slot enable failures?
Sorry for the late response. So instead of always returning error out
how about we error out only if pe is going through EEH recovery ? During
get_adapter_status I can check if pe->state is set to EEH_PE_RECOVERING
and only then return error on busy else fallback to existing method of
rtas_get_sensor. Let me send out another version with this approach.
Thanks,
-Mahesh.
>
> > +/*
> > + * RTAS call get-sensor-state(DR_ENTITY_SENSE) return values as per PAPR:
> > + * -1: Hardware Error
> > + * -2: RTAS_BUSY
> > + * -3: Invalid sensor. RTAS Parameter Error.
> > + * -9000: Need DR entity to be powered up and unisolated before RTAS call
> > + * -9001: Need DR entity to be powered up, but not unisolated, before RTAS call
> > + * -9002: DR entity unusable
> > + * 990x: Extended delay - where x is a number in the range of 0-5
> > + */
> > +#define RTAS_HARDWARE_ERROR -1
> > +#define RTAS_INVALID_SENSOR -3
> > +#define SLOT_UNISOLATED -9000
> > +#define SLOT_NOT_UNISOLATED -9001
> > +#define SLOT_NOT_USABLE -9002
> > +
> > +static int rtas_to_errno(int rtas_rc)
> > +{
> > + int rc;
> > +
> > + switch (rtas_rc) {
> > + case RTAS_HARDWARE_ERROR:
> > + rc = -EIO;
> > + break;
> > + case RTAS_INVALID_SENSOR:
> > + rc = -EINVAL;
> > + break;
> > + case SLOT_UNISOLATED:
> > + case SLOT_NOT_UNISOLATED:
> > + rc = -EFAULT;
> > + break;
> > + case SLOT_NOT_USABLE:
> > + rc = -ENODEV;
> > + break;
> > + case RTAS_BUSY:
> > + case RTAS_EXTENDED_DELAY_MIN...RTAS_EXTENDED_DELAY_MAX:
> > + rc = -EBUSY;
> > + break;
> > + default:
> > + err("%s: unexpected RTAS error %d\n", __func__, rtas_rc);
> > + rc = -ERANGE;
> > + break;
> > + }
> > + return rc;
> > +}
>
> These conversions look OK to me.
--
Mahesh J Salgaonkar
More information about the Linuxppc-dev
mailing list