[Skiboot] [PATCH skiboot] npu2: Increase timeout for L2/L3 cache purging

Stewart Smith stewart at linux.ibm.com
Thu Jun 27 15:15:37 AEST 2019


Alexey Kardashevskiy <aik at ozlabs.ru> writes:
> On NVLink2 bridge reset, we purge all L2/L3 caches in the system.
> This is an asynchronous operation with a 2ms timeout. There are
> reports that this is not enough and "PURGE L3 on core xxx timed out"
> messages appear (for reference: on the test setup this takes
> 280us..780us).
>
> This defines the timeout as a macro and changes this from 2ms to 20ms.
>
> This adds a tracepoint to tell how long it took to purge all the caches.
>
> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
> ---
>
> It would be interesting to know how long it can possibly take and if it
> depends on the actual GPU load and usage pattern.
>
> To enable or disable traces, "nvram" needs to be run and then the host
> needs to be rebooted:
>
> - enable traces:
> sudo nvram -p ibm,skiboot --update-config log-level-memory=trace
> sudo nvram -p ibm,skiboot --update-config log-level-driver=trace
>
> - disable traces:
> sudo nvram -p ibm,skiboot --update-config log-level-memory=
> sudo nvram -p ibm,skiboot --update-config log-level-driver=
> ---
>  include/npu2-regs.h |  2 ++
>  hw/npu2.c           | 20 +++++++++++++-------
>  2 files changed, 15 insertions(+), 7 deletions(-)
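
For readers without the diff at hand, the pattern described above (a timeout
defined as a macro, polling for completion, and reporting how long the purge
took) might look roughly like the sketch below. This is not the code from
hw/npu2.c: PURGE_TIMEOUT_US, POLL_STEP_US, struct fake_core and both functions
are hypothetical names, and the hardware is simulated with a counter so the
logic can be checked in isolation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical macro names; the names used in the actual patch are not
 * shown in this message. */
#define PURGE_TIMEOUT_US 20000u /* 20ms, raised from the earlier 2ms */
#define POLL_STEP_US     10u    /* assumed polling interval */

/* Simulated core: the purge is considered done once done_after_us
 * microseconds of simulated time have passed. */
struct fake_core {
	uint32_t done_after_us;
};

/* Stand-in for reading the hardware "purge pending" status. */
static bool purge_pending(const struct fake_core *c, uint32_t now_us)
{
	return now_us < c->done_after_us;
}

/* Poll until the purge completes.
 * Returns the elapsed time in microseconds (the value a trace message
 * could report), or -1 if the purge is still pending at the timeout. */
static int64_t wait_purge_done(const struct fake_core *c)
{
	uint32_t elapsed_us = 0;

	while (purge_pending(c, elapsed_us)) {
		if (elapsed_us >= PURGE_TIMEOUT_US)
			return -1;
		/* Real firmware would delay here; the sketch only
		 * advances simulated time. */
		elapsed_us += POLL_STEP_US;
	}
	return elapsed_us;
}
```

With the 780us worst case quoted above, wait_purge_done() returns 780, well
inside the 20ms budget; a purge that never completes returns -1.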

Merged to master as of d2005818bea35e74b8991a615ac5bee389263126.

Should this also go to stable?

-- 
Stewart Smith
OPAL Architect, IBM.


