[PATCH] watchdog: remove HARDLOCKUP_DETECTOR_PERF
Jinchao Wang
wangjinchao600 at gmail.com
Tue Sep 16 11:46:53 AEST 2025
On 9/15/25 23:42, Doug Anderson wrote:
> Hi,
>
> On Mon, Sep 15, 2025 at 3:35 AM Peter Zijlstra <peterz at infradead.org> wrote:
>>
>> On Mon, Sep 15, 2025 at 11:26:09AM +0100, Will Deacon wrote:
>>
>>> | If all CPUs are hard locked up at the same time the buddy system
>>> | can't detect it.
>>>
>>> Ok, so why is that limitation acceptable? It looks to me like you're
>>> removing useful functionality.
>>
>> Yeah, this. I've run into this case waaay too many times to think it
>> reasonable to remove the perf/NMI based lockup detector.
>
> I am a bit curious how this comes to be in cases where you've seen it.
> What causes all CPUs to be stuck looping all with interrupts disabled
> (but still able to execute NMIs)? Certainly one can come up with a
> synthetic way to make that happen, but I would imagine it to be
> exceedingly rare in real life. Maybe all CPUs are deadlocked waiting
> on spinlocks or something? There shouldn't be a lot of other reasons
> that all CPUs should be stuck indefinitely with interrupts disabled...
> If that's what's happening, (just spitballing) I wonder if hooking
> into the slowpath of spinlocks to look for lockups would help? Maybe
> every 10000 failures to acquire the spinlock we check for a lockup?
> Obviously you could still come up with synthetic ways to make a
> non-caught watchdog, but hopefully in those types of cases we can at
> least reset the device with a hardware watchdog?
>
> Overall the issue is that it's really awkward to have both types of
> lockup detectors, especially since you've got to pick at compile time.
> The perf lockup detector has a pile of things that make it pretty
> awkward and it seems like people have been moving toward the buddy detector
> because of this...
>
> -Doug
Should we support both modularization and changing the backend after
boot, so that the user has the choice?
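
To make the "changing the backend after boot" part concrete, here is a rough
userspace sketch in C of one possible shape for it. Everything in it is
hypothetical: struct lockup_backend, lockup_backend_select() and the stub
start/stop functions are made-up names for illustration, not existing
kernel/watchdog.c interfaces, and the real perf and buddy setup is replaced by
simple flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend interface; NOT the kernel's actual watchdog API. */
struct lockup_backend {
	const char *name;
	int (*start)(void);	/* returns 0 on success */
	void (*stop)(void);
};

/* Stub state standing in for real perf-event / buddy-timer setup. */
static int perf_running;
static int buddy_running;

static int perf_start(void)   { perf_running = 1; return 0; }
static void perf_stop(void)   { perf_running = 0; }
static int buddy_start(void)  { buddy_running = 1; return 0; }
static void buddy_stop(void)  { buddy_running = 0; }

static const struct lockup_backend backends[] = {
	{ "perf",  perf_start,  perf_stop  },
	{ "buddy", buddy_start, buddy_stop },
};

/* Currently active backend, or NULL if none has been selected yet. */
static const struct lockup_backend *active;

/*
 * Switch to the backend called @name.
 * The new backend is started before the old one is stopped, so a failed
 * start leaves the previous backend running.
 * Returns 0 on success, -1 if the name is unknown or start() failed.
 */
int lockup_backend_select(const char *name)
{
	size_t i;

	assert(name != NULL);

	for (i = 0; i < sizeof(backends) / sizeof(backends[0]); i++) {
		const struct lockup_backend *b = &backends[i];

		if (strcmp(b->name, name) != 0)
			continue;
		if (b == active)
			return 0;	/* already selected, nothing to do */
		if (b->start() != 0)
			return -1;	/* keep the old backend running */
		if (active)
			active->stop();
		active = b;
		return 0;
	}
	return -1;	/* unknown backend name */
}
```

In this sketch a caller would do something like lockup_backend_select("buddy")
at boot and lockup_backend_select("perf") later. In the kernel the selection
would presumably be driven by a sysctl or module parameter, but that part is
left out here.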
--
Jinchao