[PATCH v1 0/8] EEH Followup Fixes (II)

Gavin Shan shangw at linux.vnet.ibm.com
Fri Jul 5 12:57:26 EST 2013


This series of patches is based on linux-powerpc-next and intends to resolve
the following problems:
 
	- On the pSeries platform, EEH doesn't work after PHB hotplug
	  with "drmgr". The root cause is that the EEH resources
	  (EEH devices, EEH caches) aren't released correctly. To fix the
	  problem, we add one hook (pcibios_stop_dev), which is called
	  from pci_stop_and_remove_device(). In pcibios_stop_dev(), we
	  release the EEH resources.
	- Another issue is that we need to put the domain (PE or PHB) into
	  a quiet state while resetting that domain. However, some
	  devices in the domain might not have EEH-sensitive drivers, or
	  might not have a driver at all. Those devices can't be put into a
	  quiet state and may keep issuing PCI-CFG or MMIO requests while
	  the domain is being reset. That can cause the reset to fail, and
	  eventually EEH recovery as well. To address the issue, we introduce
	  so-called "partial hotplug": devices without a driver or
	  without an EEH-sensitive driver are removed before the reset and
	  plugged (probed) back into the system after the reset.
	- We need to traverse the EEH devices of one specific PE with the
	  safe variant of the list traversal function, because an EEH device
	  might be removed during the iteration.
	- When plugging a PCI bus, we need to check whether the resources of
	  subordinate devices have to be reassigned (PCI_REASSIGN_ALL_RSRC)
	  and do that accordingly.
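
As a rough illustration of the "partial hotplug" and safe-traversal points
above, here is a minimal userspace sketch. It is not code from the series:
struct demo_dev and the demo_*() helpers are hypothetical stand-ins for the
EEH device list of a PE. It only shows why the next pointer must be saved
before an entry is removed, which is the same idea the kernel's
list_for_each_entry_safe() is built on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an EEH device; NOT the kernel's struct eeh_dev. */
struct demo_dev {
	const char *name;
	bool has_eeh_driver;	/* assumption: driver provides error handlers */
	struct demo_dev *next;
};

/* Add a device at the front of the (hypothetical) PE device list. */
static struct demo_dev *demo_add(struct demo_dev *head, const char *name,
				 bool has_eeh_driver)
{
	struct demo_dev *dev = malloc(sizeof(*dev));

	if (!dev)
		abort();
	dev->name = name;
	dev->has_eeh_driver = has_eeh_driver;
	dev->next = head;
	return dev;
}

/* Count the devices currently on the list. */
static int demo_count(const struct demo_dev *head)
{
	int n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}

/*
 * Remove every device without an EEH-aware driver, as would happen
 * before the reset in the "partial hotplug" scheme. The traversal is
 * "safe": the successor is saved in tmp before the current entry may
 * be freed, so the loop never dereferences freed memory.
 * Returns the number of devices removed.
 */
static int demo_remove_unaware(struct demo_dev **head)
{
	struct demo_dev **link = head;	/* pointer that refers to 'dev' */
	struct demo_dev *dev, *tmp;
	int removed = 0;

	for (dev = *head; dev; dev = tmp) {
		tmp = dev->next;	/* saved before 'dev' can go away */
		if (dev->has_eeh_driver) {
			link = &dev->next;
			continue;
		}
		*link = tmp;		/* unlink, then release */
		free(dev);
		removed++;
	}
	return removed;
}
```

Without the saved tmp pointer, the loop would read dev->next from an entry
that has already been freed, which is the hazard the safe variant avoids.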

The patchset has been verified on the pSeries and PowerNV platforms:

pSeries Platform
-----------------

drmgr -c phb -r -s "PHB 513"
drmgr -c phb -a -s "PHB 513"
errinjct eeh -f 1 -s net/eth2

PowerNV Platform
-----------------

cd /sys/devices/pci0005:00/0005:00:00.0/0005:01:00.0/0005:02:08.0/0005:80:00.0/0005:90:01.0
while true; do od -x config > /dev/null; sleep 1; done
echo 1 > /sys/kernel/debug/powerpc/PCI0005/err_injct

---

arch/powerpc/include/asm/eeh.h        |   24 +++++--
arch/powerpc/include/asm/pci-bridge.h |    3 +-
arch/powerpc/include/asm/pci.h        |    2 +
arch/powerpc/kernel/eeh.c             |   56 ++++++---------
arch/powerpc/kernel/eeh_driver.c      |  106 ++++++++++++++++++++++++++-
arch/powerpc/kernel/eeh_pe.c          |   43 ++++++-----
arch/powerpc/kernel/pci-common.c      |    8 ++-
arch/powerpc/kernel/pci-hotplug.c     |  129 +++++++++++++++++++++++++++------
arch/powerpc/kernel/pci_of_scan.c     |   43 ++++++++---
drivers/pci/hotplug/rpadlpar_core.c   |    1 -
drivers/pci/probe.c                   |    4 +
drivers/pci/remove.c                  |    2 +
include/linux/pci.h                   |    1 +
13 files changed, 322 insertions(+), 100 deletions(-)

Thanks,
Gavin


