[PATCH v3] PCI hotplug: rpaphp: Error out on busy status from get-sensor-state

Mahesh J Salgaonkar mahesh at linux.ibm.com
Sat Jan 29 02:06:59 AEDT 2022


On 2021-12-09 09:02:51 Thu, Nathan Lynch wrote:
> Mahesh Salgaonkar <mahesh at linux.ibm.com> writes:
> > To avoid this issue, fix the pci hotplug driver (rpaphp) to return an error
> > if the slot presence state can not be detected immediately. Current
> > implementation uses rtas_get_sensor() API which blocks the slot check state
> > until rtas call returns success. Change rpaphp_get_sensor_state() to invoke
> > rtas_call(get-sensor-state) directly and take actions based on rtas return
> > status. This patch now errors out immediately on busy return status from
> > rtas_call.
> >
> > Please note that, only on certain PHB failures, the slot presence check
> > returns BUSY condition. In normal cases it returns immediately with a
> > correct presence state value. Hence this change has no impact on normal pci
> > dlpar operations.
> 
> I was wondering about this. This seems to be saying -2/990x cannot
> happen in other cases. I couldn't find this specified in the
> architecture. It seems a bit risky to me to *always* error out on
> -2/990x - won't we have intermittent slot enable failures?

Sorry for the late response. So instead of always returning error out
how about we error out only if pe is going through EEH recovery ? During
get_adapter_status I can check if pe->state is set to EEH_PE_RECOVERING
and only then return error on busy else fallback to existing method of
rtas_get_sensor. Let me send out another version with this approach.

Thanks,
-Mahesh.

> 
> > +/*
> > + * RTAS call get-sensor-state(DR_ENTITY_SENSE) return values as per PAPR:
> > + *    -1: Hardware Error
> > + *    -2: RTAS_BUSY
> > + *    -3: Invalid sensor. RTAS Parameter Error.
> > + * -9000: Need DR entity to be powered up and unisolated before RTAS call
> > + * -9001: Need DR entity to be powered up, but not unisolated, before RTAS call
> > + * -9002: DR entity unusable
> > + *  990x: Extended delay - where x is a number in the range of 0-5
> > + */
> > +#define RTAS_HARDWARE_ERROR	-1
> > +#define RTAS_INVALID_SENSOR	-3
> > +#define SLOT_UNISOLATED		-9000
> > +#define SLOT_NOT_UNISOLATED	-9001
> > +#define SLOT_NOT_USABLE		-9002
> > +
> > +static int rtas_to_errno(int rtas_rc)
> > +{
> > +	int rc;
> > +
> > +	switch (rtas_rc) {
> > +	case RTAS_HARDWARE_ERROR:
> > +		rc = -EIO;
> > +		break;
> > +	case RTAS_INVALID_SENSOR:
> > +		rc = -EINVAL;
> > +		break;
> > +	case SLOT_UNISOLATED:
> > +	case SLOT_NOT_UNISOLATED:
> > +		rc = -EFAULT;
> > +		break;
> > +	case SLOT_NOT_USABLE:
> > +		rc = -ENODEV;
> > +		break;
> > +	case RTAS_BUSY:
> > +	case RTAS_EXTENDED_DELAY_MIN...RTAS_EXTENDED_DELAY_MAX:
> > +		rc = -EBUSY;
> > +		break;
> > +	default:
> > +		err("%s: unexpected RTAS error %d\n", __func__, rtas_rc);
> > +		rc = -ERANGE;
> > +		break;
> > +	}
> > +	return rc;
> > +}
> 
> These conversions look OK to me.

-- 
Mahesh J Salgaonkar


More information about the Linuxppc-dev mailing list