EEH regression: PE <-> device binding lost after reset
Daniel Axtens
dja at axtens.net
Mon Aug 10 09:23:27 AEST 2015
Hi,
I'm experiencing a regression in EEH that was introduced somewhere
between 4.0 and 4.1.
I have been reproducing this with a CAPI (CXL) card, but the behaviour
isn't CAPI-related and the triggering code hasn't changed. CAPI cards
are reprogrammed by PERSTing the slot they sit in, so CAPI exposes a
'reset' file in sysfs that calls "pci_set_pcie_reset_state(dev,
pcie_warm_reset)", and then relies on EEH noticing the reset and
recovering the card properly.
In 4.0 and earlier, this worked: the slot would be PERSTed, EEH would
notice and hotplug the card. You could do this as many times as you liked.
In 4.1 and later, you can do one successful reset, but any subsequent
reset causes the following to be printed in dmesg:
[ 225.118656] cxl-pci 0006:01:00.0: CXL reset
[ 225.118663] pcibios_set_pcie_reset_state: No PE found on PCI device 0006:01:00.0
[ 225.118672] cxl-pci 0006:01:00.0: cxl: pcie_warm_reset failed
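For anyone checking whether their own logs show the same failure, here is a minimal sketch that scans dmesg text for the "No PE found" line quoted above. It is not part of the cxl driver or of EEH; the function name devices_without_pe and the regular expression are made up for illustration, and the only input assumed is log text in the format shown in this report.

```python
import re

# Matches the failure line from the report, with or without a dmesg timestamp.
# The pattern is an assumption based solely on the log excerpt above.
NO_PE_RE = re.compile(
    r"pcibios_set_pcie_reset_state: No PE found on PCI device (\S+)"
)


def devices_without_pe(dmesg_text):
    """Return the PCI addresses reported as having no PE, in log order."""
    return [m.group(1) for m in NO_PE_RE.finditer(dmesg_text)]


# The three lines quoted in the report, used here as sample input.
sample = """\
[ 225.118656] cxl-pci 0006:01:00.0: CXL reset
[ 225.118663] pcibios_set_pcie_reset_state: No PE found on PCI device 0006:01:00.0
[ 225.118672] cxl-pci 0006:01:00.0: cxl: pcie_warm_reset failed
"""

print(devices_without_pe(sample))  # prints ['0006:01:00.0']
```

An empty result after a second reset would suggest the PE binding survived; a non-empty one matches the regression described here.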
I'm digging through the commits between 4.0 and 4.1 at the moment, but I
thought I'd post it here in the hope that someone has an idea what the
root cause is.
--
Regards,
Daniel