[Skiboot] [PATCH 0/3] npu2: Additional hw glitch mitigation

Stewart Smith stewart at linux.vnet.ibm.com
Wed Nov 29 08:28:15 AEDT 2017


Reza Arbab <arbab at linux.vnet.ibm.com> writes:
> I think we have finally gotten ahead of the glitching DL clock mux that
> is causing so much trouble for NVLink training/stability.
>
> With these changes, we've tested hundreds of boot cycles without tripping
> the check_credits safeguard added to detect training error. Before, we
> were dealing with something like a 1-2% failure rate.
>
> Reza Arbab (3):
>   npu2: hw-procedures: Add obus_brick_index()
>   npu2: hw-procedures: Manipulate IOVALID during training
>   npu2: hw-procedures: Change phy_rx_clock_sel values
>
>  hw/npu2-hw-procedures.c | 61 +++++++++++++++++++++++++++++++++----------------
>  1 file changed, 41 insertions(+), 20 deletions(-)

NVLink magic numbers merged to master as of
878c718aed200cc2b6b7c6bca3a6e2fa2351ec95 and heading for a 5.9.4 release.

-- 
Stewart Smith
OPAL Architect, IBM.


