[PATCH v3 0/5] EEH improvement

Gavin Shan shangw at linux.vnet.ibm.com
Tue Feb 25 18:28:33 EST 2014


This series of patches intends to improve the reliability of EEH on the
PowerNV platform. First of all, we have had multiple duplicate states
(flags) for PHB and PE, so we remove the duplicates to simplify the code.
Besides, the PHB diag-data was corrupted in the frozen PE case. To solve
the problem, we introduce eeh_ops->event(): the EEH core sends
notifications to the (PowerNV) platform when a PE instance is created or
destroyed, so that the platform can allocate or free the PHB diag-data
backend. Then we cache the PHB diag-data on the first call to
eeh_ops->get_state() and dump it afterwards, which helps to get correct
PHB diag-data.

With the patchset applied, we never dump PHB diag-data for INF errors.
Instead, we just maintain statistics in /proc/powerpc/eeh_inf_err. Also,
we changed the PHB diag-data dump format a bit so that it has multiple
fields per line and omits any line whose fields are all zero, as Ben
suggested.
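
As an illustration of the dump format described above, here is a minimal
user-space sketch. It is not code from the series: the helper name
diag_format_pair(), the two-field layout, and the 32-bit field width are
all assumptions made for the example. It shows several fields on one line,
a zero field printed as "00000000", and a line whose fields are all zero
being skipped.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not taken from the patches): format one dump line
 * that carries two 32-bit diag-data fields.
 *
 * Returns the number of characters snprintf() reports for the line, or 0
 * when both fields are zero and the line is omitted. In the omitted case
 * buf is set to the empty string (if len > 0).
 */
static int diag_format_pair(char *buf, size_t len, const char *tag,
                            uint32_t a, uint32_t b)
{
    if (a == 0 && b == 0) {
        /* All fields are zero: emit nothing for this line. */
        if (len > 0)
            buf[0] = '\0';
        return 0;
    }

    /* A zero field is still printed, zero-padded, as "00000000". */
    return snprintf(buf, len, "%s: %08" PRIx32 " %08" PRIx32 "\n",
                    tag, a, b);
}
```

With a 64-byte buffer, diag_format_pair(buf, sizeof(buf), "example", 0, 2)
produces the line "example: 00000000 00000002", while passing 0 for both
fields produces no line at all.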

v2 -> v3:
	* We don't cache the PHB diag-data, instead we just grab and
	  dump PHB diag-data on the first catch-up to avoid broken
	  PHB diag-data.
v1 -> v2:
	* Amend commit logs
	* Support eeh_ops->event() and maintain PHB diag-data on a
	  per-PE-instance basis
	* When dumping PHB diag-data, replace "-" with "00000000" and
	  omit a line if all of its fields are zero.

---

arch/powerpc/include/asm/eeh.h            |    1 -
arch/powerpc/kernel/eeh.c                 |   10 +---
arch/powerpc/kernel/eeh_driver.c          |   10 ++--
arch/powerpc/platforms/powernv/eeh-ioda.c |  137 ++++++++++++++++++++--------------------------
arch/powerpc/platforms/powernv/pci.c      |  228 ++++++++++++++++++++++++++++++++++++++++++---------------------------
arch/powerpc/platforms/powernv/pci.h      |    8 +--
6 files changed, 195 insertions(+), 199 deletions(-)

Thanks,
Gavin


