[Skiboot] [PATCH v3] phb4: Check for RX errors after link training

Stewart Smith stewart at linux.vnet.ibm.com
Fri Nov 2 18:59:10 AEDT 2018


Michael Neuling <mikey at neuling.org> writes:
> From: Oliver O'Halloran <oohall at gmail.com>
>
> Some PHB4 PHYs can get stuck in a bad state where they are constantly
> retraining the link. This happens transparently to skiboot and Linux
> but will cause PCIe to be slow. Resetting the PHB4 clears the
> problem.
>
> We can detect this case by looking at the RX errors count where we
> check for link stability. This patch does this by modifying the link
> optimal code to check for RX errors. If errors are occurring we
> retrain the link irrespective of the chip rev or card.
>
> Normally when this problem occurs, the RX error count is maxed out at
> 255. When there is no problem, the count is 0. We chose 8 as the max
> rx errors value to give us some margin for a few errors. There is also
> a knob that can be used to set the error threshold for when we should
> retrain the link, e.g.:
>
>   nvram -p ibm,skiboot --update-config phb-rx-err-max=8
>
> Signed-off-by: Oliver O'Halloran <oohall at gmail.com>
> Signed-off-by: Michael Neuling <mikey at neuling.org>

Cheers.

Figured this should also go to stable, since everybody's been asking me,
so merged to master as of 9597a12ef4b3644e4b8644f659bec04ca139b7f9
and to 6.0.x as of 125cecaa0f236cc01cf527b432f74e4de69f3d12, which made
it into 6.0.11.
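
For anyone skimming the archive, the threshold check described in the quoted
commit message could be sketched roughly as below. This is only an
illustration, not the code from the patch: the helper name, the macro name and
the exact comparison (strictly greater than the threshold) are assumptions;
only the default of 8 and the 0..255 counter range come from the commit
message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Default threshold of 8, as stated in the commit message.
 * The macro name itself is hypothetical. */
#define RX_ERR_MAX_DEFAULT 8

/*
 * Hypothetical helper: decide whether the link should be retrained,
 * given the RX error count read after link training (0..255) and the
 * configured threshold (e.g. from the phb-rx-err-max nvram option).
 *
 * Assumption: a count equal to the threshold is still tolerated;
 * only counts above it trigger a retrain.
 */
static bool rx_errors_need_retrain(uint8_t rx_err_count, uint8_t rx_err_max)
{
	return rx_err_count > rx_err_max;
}
```

With the default threshold, a healthy link (count 0) is left alone and a stuck
PHY (count maxed out at 255) is retrained, with a margin of a few errors in
between.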

-- 
Stewart Smith
OPAL Architect, IBM.


