[Skiboot] [PATCH v3 0/7] Don't checkstop on opencapi unexpected link down
Frederic Barrat
fbarrat at linux.ibm.com
Sat Apr 6 01:32:57 AEDT 2019
This series changes the system behavior when an opencapi link goes
down unexpectedly. The current default is to checkstop, which is
fine in an IBM environment where we have tools to debug, but it doesn't
help people outside of IBM who are developing AFUs. Furthermore,
there's no reason to checkstop: we can just fence the brick, log an
error and report it to the OS. Therefore, we change the default action
for those errors to raise an interrupt instead of checkstopping.

We also improve the NPU state that is logged on those errors, as well
as on HMIs, to help with debugging.
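
As a rough illustration of the behavior change described above, here is a
minimal, self-contained sketch. It is not the code from this series: the
enum, the struct and handle_link_down() are hypothetical names made up for
the example, and the real patches work on NPU2 FIR registers and OPAL
interrupts rather than on a plain C struct.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical: action configured for a link-down error. */
enum link_err_action {
	LINK_ERR_CHECKSTOP,	/* previous default: halt the whole system */
	LINK_ERR_INTERRUPT,	/* new default: raise an interrupt instead */
};

/* Hypothetical, simplified view of one OpenCAPI brick. */
struct brick_state {
	int index;
	bool fenced;		/* brick has been isolated */
	bool os_notified;	/* error has been reported to the OS */
};

/*
 * Handle an unexpected link down on one brick.
 * Returns true if the system keeps running, false if it checkstops.
 */
bool handle_link_down(struct brick_state *b, enum link_err_action action)
{
	if (action == LINK_ERR_CHECKSTOP) {
		fprintf(stderr, "brick %d: link down, checkstop\n", b->index);
		return false;
	}

	/* Interrupt path: fence the brick, log an error, report to the OS. */
	b->fenced = true;
	fprintf(stderr, "brick %d: link down, brick fenced\n", b->index);
	b->os_notified = true;
	return true;
}
```

With LINK_ERR_INTERRUPT only the affected brick is marked as fenced and
the rest of the system keeps running, which is the point of the series.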
Changelog:
v3:
 - Rework "Dump (more) npu2 registers on link error and HMIs" to
   address Andrew's comments
v2:
 - Rework "Dump (more) npu2 registers on link error and HMIs" to
   address Alexey's comments
Andrew Donnellan (1):
  hw/npu2: Fix OpenCAPI PE assignment

Frederic Barrat (6):
  hw/npu2: Move npu2 irq setup code to common area
  hw/npu2: Use NVLink irq setup for OpenCAPI
  hw/npu2: Setup an error interrupt on some opencapi FIRs
  hw/npu2: Report errors to the OS if an OpenCAPI brick is fenced
  hw/npu2: Dump (more) npu2 registers on link error and HMIs
  opal/hmi: Never trust a cow!
 core/hmi.c          |  60 +-------
 hw/npu2-common.c    | 362 ++++++++++++++++++++++++++++++++++++++++++++
 hw/npu2-opencapi.c  | 186 +++++++++++++----------
 hw/npu2.c           | 100 ------------
 include/npu2-regs.h |  15 +-
 include/npu2.h      |  24 ++-
 6 files changed, 503 insertions(+), 244 deletions(-)
--
2.19.1