[Skiboot] [PATCH 0/2] Enable reporting of frozen NVLink bricks

Alistair Popple alistair at popple.id.au
Thu Jan 11 15:28:49 AEDT 2018


When a GPU performs an invalid access via NVLink2, it can cause the NPU to
freeze the associated PE. An interrupt is raised when this occurs, but no
interrupt handler is registered for it. This series fixes a bug in the
existing NPU2 interrupt setup and adds an interrupt handler that reports the
error as an EEH event.

This is similar to what is done for NVLink1 and allows the operating system
to report the error instead of it being ignored.
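
For readers unfamiliar with the flow, the following is a minimal,
self-contained sketch of the idea only. It is not the code in this series:
the structure, field names and function names are all hypothetical, and the
freeze status is modelled as a plain bitmap rather than a real NPU2 register.
The interrupt handler latches frozen PEs as pending events, and a separate
query hands them to the operating system one at a time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not the skiboot implementation: one bit per PE. */
struct npu_model {
	uint64_t pe_freeze_status; /* stand-in for a hardware freeze status */
	uint64_t eeh_pending;      /* PEs with an event not yet reported */
};

/*
 * Interrupt handler sketch: latch every frozen PE that is not already
 * pending. Returns the number of PEs newly marked pending.
 */
static int npu_model_err_interrupt(struct npu_model *npu)
{
	uint64_t fresh = npu->pe_freeze_status & ~npu->eeh_pending;
	int count = 0;

	npu->eeh_pending |= fresh;
	/* Count set bits by clearing the lowest one each iteration. */
	for (; fresh; fresh &= fresh - 1)
		count++;
	return count;
}

/*
 * OS-facing query sketch: report and clear the lowest pending PE.
 * Returns true and stores the PE number if an event was pending.
 * Clearing the freeze itself is left out of this model.
 */
static bool npu_model_next_error(struct npu_model *npu, unsigned int *pe)
{
	unsigned int i;

	for (i = 0; i < 64; i++) {
		if (npu->eeh_pending & (1ULL << i)) {
			npu->eeh_pending &= ~(1ULL << i);
			*pe = i;
			return true;
		}
	}
	return false;
}
```

Without the handler step, the pending bitmap in this model would never be
populated, which corresponds to the error being silently ignored.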

Alistair Popple (2):
  npu2.c: Fix XIVE IRQ alignment
  npu2.c: Add PE error detection

 hw/npu2.c           | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 include/npu2-regs.h | 17 +---------------
 2 files changed, 55 insertions(+), 19 deletions(-)

-- 
2.11.0


