[PATCH v2 0/9] EEH improvement

Gavin Shan shangw at linux.vnet.ibm.com
Tue Feb 25 16:37:41 EST 2014

This series of patches is intended to improve the reliability of EEH on
the PowerNV platform. First of all, we have had multiple duplicate
states (flags) for PHB and PE, so we remove those duplicates to simplify
the code. Besides, the PHB diag-data was corrupted in the frozen PE
case. To solve that problem, we introduce eeh_ops->event(): the EEH core
notifies the (PowerNV) platform when a PE instance is created or
destroyed so that the platform can allocate or free the PHB diag-data
backend. We then cache the PHB diag-data on the first call to
eeh_ops->get_state() and dump it afterwards, which helps to get correct
PHB diag-data.
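
To illustrate the idea (not the actual patch code), here is a minimal
userspace sketch of a per-PE diag-data buffer that is allocated and
freed on PE lifetime events and filled only on the first state query.
All names below (struct pe, pe_event, platform_event(),
platform_get_state(), read_diag_from_hw(), DIAG_SIZE) are hypothetical
stand-ins and do not reflect the real eeh_ops signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define DIAG_SIZE 64	/* hypothetical buffer size, for illustration only */

/* Hypothetical PE lifetime events sent from the core to the platform */
enum pe_event { PE_EVENT_CREATE, PE_EVENT_DESTROY };

struct pe {
	unsigned char *diag;	/* per-PE diag-data backend */
	bool diag_cached;	/* true once a snapshot has been taken */
};

/* Stand-in for fetching diag-data from firmware: fill with a marker */
static void read_diag_from_hw(unsigned char *buf, unsigned char marker)
{
	memset(buf, marker, DIAG_SIZE);
}

/* Platform hook: allocate the backend on create, free it on destroy */
static int platform_event(struct pe *pe, enum pe_event ev)
{
	switch (ev) {
	case PE_EVENT_CREATE:
		pe->diag = calloc(1, DIAG_SIZE);
		pe->diag_cached = false;
		return pe->diag ? 0 : -1;
	case PE_EVENT_DESTROY:
		free(pe->diag);
		pe->diag = NULL;
		pe->diag_cached = false;
		return 0;
	}
	return -1;
}

/* State query: only the first call captures diag-data; later calls
 * leave the cached snapshot untouched so it can be dumped afterwards. */
static void platform_get_state(struct pe *pe, unsigned char hw_marker)
{
	assert(pe->diag != NULL);
	if (!pe->diag_cached) {
		read_diag_from_hw(pe->diag, hw_marker);
		pe->diag_cached = true;
	}
}
```

In this sketch a second platform_get_state() call with different
hardware contents does not overwrite the buffer, which is the property
the cover letter relies on to dump the diag-data from the time of the
first query.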

With the patchset applied, we never dump PHB diag-data for INF errors.
Instead, we just maintain statistics in /proc/powerpc/eeh_inf_err. Also,
we changed the PHB diag-data dump format slightly to have multiple
fields per line and to omit lines whose fields are all zero, as Ben
suggested.
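
As a rough sketch of that formatting rule (an assumption about the
shape of the output, not the code in pci.c), the helper below puts a
group of 32-bit fields on one line and skips the line when every field
is zero. The function name format_fields() and the label used in the
usage note are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "label: v0 v1 ..." into buf, each value as 8 hex digits.
 * Returns 1 if a line was produced, 0 if it was skipped because all
 * fields are zero (buf is then left as an empty string).
 */
static int format_fields(char *buf, size_t len, const char *label,
			 const uint32_t *vals, size_t n)
{
	size_t i, off;
	int nonzero = 0;

	assert(buf != NULL);
	if (len == 0)
		return 0;
	buf[0] = '\0';

	/* Omit the whole line when every field is zero */
	for (i = 0; i < n; i++)
		nonzero |= (vals[i] != 0);
	if (!nonzero)
		return 0;

	/* Several fields share one line; output is truncated at len */
	off = (size_t)snprintf(buf, len, "%s:", label);
	for (i = 0; i < n && off < len; i++)
		off += (size_t)snprintf(buf + off, len - off, " %08x",
					(unsigned int)vals[i]);
	return 1;
}
```

For example, calling it with a made-up label "regs" and the values
{0, 0x1c, 0} yields "regs: 00000000 0000001c 00000000", while a group
of all-zero values produces no line at all.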

v1 -> v2:
	* Amend commit logs
	* Support eeh_ops->event() and maintain PHB diag-data on a per-PE
	  instance basis
	* When dumping PHB diag-data, replace "-" with "00000000" and
	  omit a line if all of its fields are zero.


arch/powerpc/include/asm/eeh.h               |    7 ++-
arch/powerpc/kernel/eeh.c                    |   10 +---
arch/powerpc/kernel/eeh_driver.c             |   10 ++--
arch/powerpc/kernel/eeh_pe.c                 |   39 ++++++++++++-
arch/powerpc/platforms/powernv/eeh-ioda.c    |  193 ++++++++++++++++++++++++++++++++++++-------------------------
arch/powerpc/platforms/powernv/eeh-powernv.c |   74 +++++++++++++++++++-----
arch/powerpc/platforms/powernv/pci.c         |  228 +++++++++++++++++++++++++++++++++++++++++-------------------------
arch/powerpc/platforms/powernv/pci.h         |   11 ++--
arch/powerpc/platforms/pseries/eeh_pseries.c |    3 +-
9 files changed, 358 insertions(+), 217 deletions(-)

