[Skiboot] [PATCH] npu2: hw-procedures: Add check_credits procedure
alistair at popple.id.au
Wed Nov 22 09:29:11 AEDT 2017
On Tue, 21 Nov 2017 03:21:49 PM Reza Arbab wrote:
> As an immediate mitigator for a current hardware glitch, add a procedure
> that can be used to validate NTL credit values. This will be called as a
> safeguard to check that link training succeeded.
> Assert that things are exactly as we expect, because if they aren't, the
> system will experience a catastrophic failure shortly after the start of
> link traffic.
I guess we could return a procedure failure, which would cause the driver
load to fail without crashing the whole system. However, I suppose this failure
mode is much more subtle, so I agree it's probably best to just fail loud and
early, given that this HW state clearly indicates a bug.
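To make the two options concrete, here is a rough sketch of the trade-off. It is not the code from the patch: the status enum, the function names and the plain integer credit values are all hypothetical stand-ins, and the real procedure reads the NTL credit registers rather than taking arguments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status codes, standing in for the real procedure results. */
enum proc_status { PROC_COMPLETE, PROC_FAILED };

/* Hypothetical helper: true when a credit value matches the expected one. */
static bool credits_match(uint64_t actual, uint64_t expected)
{
	return actual == expected;
}

/* Option 1: report a failure so that only the driver load fails. */
static enum proc_status check_credits_soft(uint64_t actual, uint64_t expected)
{
	return credits_match(actual, expected) ? PROC_COMPLETE : PROC_FAILED;
}

/* Option 2: assert, stopping before any link traffic can start. */
static enum proc_status check_credits_hard(uint64_t actual, uint64_t expected)
{
	assert(credits_match(actual, expected));
	return PROC_COMPLETE;
}
```

With option 1 the caller has to notice and act on PROC_FAILED; with option 2 a bad credit value cannot be ignored, which is the "fail loud and early" behaviour.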
One comment though - can you please add this procedure to doc/nvlink.rst in this
> #define NPU2DEVDBG(p, fmt, a...) NPU2DBG((p)->npu, fmt, ##a)
> #define NPU2DEVINF(p, fmt, a...) NPU2INF((p)->npu, fmt, ##a)
> -#define NPU2DEVERR(p, fmt, a...) NPU2ERR((p)->npu, fmt, ##a)
> +#define NPU2DEVERR(p, fmt, a...) prlog(PR_ERR, "NPU%d:%d:%d.%d " fmt, \
> + (p)->npu->phb.opal_id, \
> + ((p)->bdfn >> 8) & 0xff, \
> + ((p)->bdfn >> 3) & 0x1f, \
> + (p)->bdfn & 0x7, ##a)
Would also be nice to add this info for NPU2DEVDBG/INF.
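Something along these lines is what I have in mind; treat it as a sketch only. NPU2DEVLOG is a hypothetical helper name, the structs are simplified stand-ins for the real skiboot types (plain int fields to match the %d conversions), and prlog() is stubbed to write into a buffer so the snippet builds on its own. The PR_* values are placeholders as well.

```c
#include <stdarg.h>
#include <stdio.h>

/* Simplified stand-ins: only the fields the macros touch. */
struct phb { int opal_id; };
struct npu2 { struct phb phb; };
struct npu2_dev { struct npu2 *npu; int bdfn; };

/* Placeholder log levels for this sketch. */
#define PR_ERR   3
#define PR_INFO  6
#define PR_DEBUG 7

/* Stub prlog(): keeps the last formatted message for inspection. */
static char last_log[128];

static void prlog(int level, const char *fmt, ...)
{
	va_list ap;

	(void)level;
	va_start(ap, fmt);
	vsnprintf(last_log, sizeof(last_log), fmt, ap);
	va_end(ap);
}

/* Hypothetical common helper: prefix every message with bus:dev.fn. */
#define NPU2DEVLOG(l, p, fmt, a...) prlog(l, "NPU%d:%d:%d.%d " fmt, \
					  (p)->npu->phb.opal_id, \
					  ((p)->bdfn >> 8) & 0xff, \
					  ((p)->bdfn >> 3) & 0x1f, \
					  (p)->bdfn & 0x7, ##a)

#define NPU2DEVDBG(p, fmt, a...) NPU2DEVLOG(PR_DEBUG, p, fmt, ##a)
#define NPU2DEVINF(p, fmt, a...) NPU2DEVLOG(PR_INFO, p, fmt, ##a)
#define NPU2DEVERR(p, fmt, a...) NPU2DEVLOG(PR_ERR, p, fmt, ##a)
```

That way all three levels share one prefix definition and cannot drift apart.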
Acked-by: Alistair Popple <alistair at popple.id.au>
> /* Number of PEs supported */
> #define NPU2_MAX_PE_NUM 16