[PATCH 0/2] Disabling NMI watchdog during LPM's memory transfer
Nathan Lynch
nathanl at linux.ibm.com
Fri Jun 3 03:58:31 AEST 2022
Laurent Dufour <ldufour at linux.ibm.com> writes:
> When a partition is transferred, once it arrives at the destination node,
> the partition is active but much of its memory must be transferred from the
> start node.
>
> It depends on the activity in the partition, but the more CPU the partition
> has, the more memory to be transferred is likely to be. This causes latency
> when accessing pages that need to be transferred, and often, for large
> partitions, it triggers the NMI watchdog.
It also triggers warnings from other watchdogs and subsystems that
have soft latency requirements - softlockup, RCU, workqueue. The issue
is more general than the NMI watchdog.
> The NMI watchdog causes the CPU stack to dump where it appears to be
> stuck. In this case, it does not bring much information since it can happen
> during any memory access of the kernel.
When the site of a watchdog backtrace shows a thread stuck on a routine
memory access as opposed to something like a lock acquisition, that is
actually useful information that shouldn't be discarded. It tells us the
platform is failing to adequately virtualize partition memory. This
isn't a benign situation and it's likely to unacceptably affect real
workloads. The kernel is ideally situated to detect and warn about this.
> In addition, the NMI interrupt mechanism is not secure and can generate a
> dump system in the event that the interruption is taken while
> MSR[RI]=0.
This sounds like a general problem with that facility that isn't
specific to partition migration? Maybe it should be disabled altogether
until that can be fixed?
> Given how often hard lockups are detected when transferring large
> partitions, it seems best to disable the watchdog NMI until the memory
> transfer from the start node is complete.
At this time, I'm far from convinced. Disabling the watchdog is going to
make the underlying problems in the platform and/or network harder to
understand.
More information about the Linuxppc-dev
mailing list