[Skiboot] skiboot 6.0.5 released

Wed Jul 11 15:22:31 AEST 2018

skiboot-6.0.5
*************

skiboot 6.0.5 was released on Wednesday July 11th, 2018. It replaces
skiboot-6.0.4 as the current stable release in the 6.0.x series.

It is recommended that 6.0.5 be used instead of any previous 6.0.x
version.

Over skiboot-6.0.4 we have several bug fixes, including important ones
for NVLINK2 and NX.

PCI/PHB4
========

* phb4: Delay training till after PERST is deasserted

  This helps some cards train on the second PERST (ie fast-reboot).
  The reason is not clear why but it helps, so YOLO!

* pci: Fix PCI_DEVICE_ID()

  The vendor ID is 16 bits not 8. This error leaves the top of the
  vendor ID in the bottom bits of the device ID, which resulted in
  e.g. a failure to run the PCI quirk for the AST VGA device.

  Fixes: 2b841bf0ef1b (present in v5.7-rc1)

PHB4/CAPI
=========

* phb4/capp: Calculate STQ/DMA read engines based on link-width for
  PEC

  Presently in CAPI mode the number of STQ/DMA-read engines allocated
  on PEC2 for CAPP is fixed to 6 and 0-30 respectively irrespective of
  the PCI link width. These values are only suitable for x8 cards and
  quickly run out if a x16 card is plugged to a PEC2 attached slot.
  This usually manifests as CAPP reporting TLBI timeout due to these
  messages getting stalled due to insufficient STQs.

  To fix this we update enable_capi_mode() to check if PEC2 chiplet is
  in x16 mode and if yes then we allocate 4/0-47 STQ/DMA-read engines
  for the CAPP traffic.

* capi: Select the correct IODA table entry for the mbt cache.

  With the current code, the capi mmio window is not correctly
  configured in the IODA table entry. The first entry (generally the
  non-prefetchable BAR) is overwrriten. This patch sets the capi
  window bar at the right place.

Sensors
=======

* occ: sensors: Fix the size of the phandle array ‘sensors’ in DT

  Fixes: 99505c03f493 (present in v5.10-rc4)

NPU2/NVLINK2
============

* npu2/hw-procedures: Fence bricks via NTL instead of MISC

  There are a couple of places we can set/unset fence for a brick:

  1. MISC register: NPU2_MISC_FENCE_STATE

  2. NTL register for the brick: NPU2_NTL_MISC_CFG1(ndev)

  Recent testing of ATS in combination with GPU reset has exposed a
  side effect of using (1); if fence is set for all six bricks, it
  triggers a sticky nmmu latch which prevents the NPU from getting ATR
  responses. This manifests as a hang in the tests.

  We have npu2_dev_fence_brick() which uses (1), and only two calls to
  it. Replace the call which sets fence with a write to (2). Remove
  the corresponding unset call entirely. It’s unneeded because the
  procedures already do a progression from full fence to half to idle
  using (2).

* opal/hmi: Display correct chip id while printing NPU FIRs.

  HMIs for NPU xstops are broadcasted to all chips. All cores on all
  the chips receive HMI. HMI handler correctly identifies and extracts
  the NPU FIR details from affected chip, but while printing FIR data
  it prints chip id and location code details of this_cpu()->chip_id
  which may not be correct. This patch fixes this issue.

  Fixes: 7bcbc78c (present in v6.0.1)

VPD
===

* vpd: Add vendor property to processor node

  Processor FRU vpd doesn’t contain vendor detail. We have to parse
  module VPD to get vendor detail.

* vpd: Sanitize VPD data

  On OpenPower system, VPD keyword size tells us the maximum size of
  the data. But they fill trailing end with space (0x20) instead of
  NULL. Also spec doesn’t stop user to have space (0x20) within actual
  data.

  This patch discards trailing spaces before populating device tree.

NX/VAS for POWER9
=================

* NX: Add NX coprocessor init opal call

  The read offset (4:11) in Receive FIFO control register is
  incremented by FIFO size whenever CRB read by NX. But the index in
  RxFIFO has to match with the corresponding entry in FIFO maintained
  by VAS in kernel. VAS entry is reset to 0 when opening the receive
  window during driver initialization. So when NX842 is reloaded or in
  kexec boot, possibility of mismatch between RxFIFO control register
  and VAS entries in kernel. It could cause CRB failure / timeout from
  NX.

  This patch adds nx_coproc_init opal call for kernel to initialize
  readOffset (4:11) and Queued (15:23) in RxFIFO control register.

  Fixes: 3b3c5962f432 (present in v5.8-rc1)

-- 
Stewart Smith
OPAL Architect, IBM.