[PATCH 6/7] ppc64: EEH Avoid racing reports of errors

Paul Mackerras paulus at samba.org
Wed Oct 5 21:23:11 EST 2005


Linas writes:

> 06-eeh-report-race.patch

> +/** Mark all devices that are peers of this device as failed.
> + *  Mark the device driver too, so that it can see the failure
> + *  immediately; this is critical, since some drivers poll
> + *  status registers in interrupts ... If a driver is polling,
> + *  and the slot is frozen, then the driver can deadlock in
> + *  an interrupt context, which is bad.
> + */
> +
> +static inline void __eeh_mark_slot (struct device_node *dn)
> +{
> +	while (dn) {
> +		PCI_DN(dn)->eeh_mode |= EEH_MODE_ISOLATED;
> +
> +		if (dn->child)
> +			__eeh_mark_slot (dn->child);
> +		dn = dn->sibling;
> +	}
> +}

So this does the device node that we pass in, plus all the nodes that
come after it in its parent's list of children.  On that basis I
expected you to pass in the first child of the EADS bridge, but I see:

> +	pe_dn = find_device_pe (dn);
> +	__eeh_mark_slot (pe_dn);

My understanding is that pe_dn will end up pointing to the device node
for the EADS bridge.  Shouldn't you pass in pe_dn->child here, or
alternatively rearrange __eeh_mark_slot to do the node you give it
plus its children (recursively)?

Two other comments about __eeh_mark_slot: (1) despite the comment, the
function doesn't do anything to any pci_dev or pci_driver (not that it
should be touching any pci_driver), and (2) a recursive function can't
really be inline (unless gcc is smart enough to turn arbitrary
recursive functions into iterative functions, which I doubt :).

Regards,
Paul.



More information about the Linuxppc64-dev mailing list