[PATCH 0/8] CXL EEH Handling

Daniel Axtens dja at axtens.net
Tue Jul 14 12:29:26 AEST 2015

CXL accelerators are unfortunately not immune from failure. This patch
set enables them to participate in the Extended Error Handling (EEH)
process.

This series starts with a number of preparatory patches:

 - Patch 1 creates a kernel flag that allows us to confidently assert
   that the image on the hardware will not change when it's reset
   (PERST).
 - Patch 2 makes sure we don't touch the hardware when it has failed.
 - Patches 3-5 make the 'unplug' functions idempotent, so that if we
   get part way through recovery and then fail, being completely
   unplugged as part of removal doesn't cause us to oops out.
 - Patches 6 and 7 refactor init and teardown paths for the adapter
   and AFUs, so that they can be configured and deconfigured
   separately from their allocation and release.
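
As a rough illustration of the ideas behind patches 2-5, here is a
minimal userspace sketch. It is not code from this series: every name
in it (fake_adapter, fake_write_reg, fake_unplug and so on) is
hypothetical, and the real driver works on PCI and MMIO resources
rather than malloc'd memory. It only shows the two patterns: dropping
hardware accesses when the channel is not in its normal state, and
teardown helpers that are safe to call more than once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the cxl driver's real types. */
enum channel_state { CHANNEL_NORMAL, CHANNEL_FROZEN, CHANNEL_PERM_FAILURE };

struct fake_adapter {
	enum channel_state state;
	void *mmio;     /* NULL once unmapped */
	int irqs_held;  /* 0 once released */
	int hw_writes;  /* counts accesses that reached the "hardware" */
};

/* Drop the command unless the channel is in its normal state. */
static int fake_write_reg(struct fake_adapter *a)
{
	if (a->state != CHANNEL_NORMAL)
		return -1;      /* hardware has failed: do not touch it */
	a->hw_writes++;
	return 0;
}

/* Idempotent: a second call sees NULL and returns immediately. */
static void fake_unmap(struct fake_adapter *a)
{
	if (!a->mmio)
		return;
	free(a->mmio);
	a->mmio = NULL;
}

/* Idempotent: nothing to do if the IRQs were already released. */
static void fake_release_irqs(struct fake_adapter *a)
{
	if (a->irqs_held == 0)
		return;
	a->irqs_held = 0;
}

/* Safe to call after a partial recovery, and again at removal. */
static void fake_unplug(struct fake_adapter *a)
{
	fake_release_irqs(a);
	fake_unmap(a);
}
```

With helpers shaped like this, a recovery path that fails part way
through can call fake_unplug(), and the later removal path can call it
again without double-freeing or dereferencing a stale pointer.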

Patch 8 enables EEH for both the CXL card and anything attached to
the virtual PHB. Only complete slot resets are supported.

Daniel Axtens (8):
  cxl: Allow the kernel to trust that an image won't change on PERST.
  cxl: Drop commands if the PCI channel is not in normal state
  cxl: Allocate and release the SPA with the AFU
  cxl: Make IRQ release idempotent
  cxl: Clean up adapter MMIO unmap path.
  cxl: Refactor adaptor init/teardown
  cxl: Refactor AFU init/teardown
  cxl: EEH support

 Documentation/ABI/testing/sysfs-class-cxl |  10 +
 drivers/misc/cxl/api.c                    |   7 +
 drivers/misc/cxl/cxl.h                    |  38 ++-
 drivers/misc/cxl/file.c                   |  20 ++
 drivers/misc/cxl/irq.c                    |   9 +
 drivers/misc/cxl/native.c                 | 100 +++++-
 drivers/misc/cxl/pci.c                    | 498 ++++++++++++++++++++++++------
 drivers/misc/cxl/sysfs.c                  |  26 ++
 include/misc/cxl.h                        |  10 +
 9 files changed, 602 insertions(+), 116 deletions(-)


