[PATCH] powerpc/watchdog: Use hrtimers for per-CPU heartbeat
Nicholas Piggin
npiggin at gmail.com
Fri Apr 5 03:03:31 AEDT 2019
Gautham R Shenoy's on April 4, 2019 9:19 pm:
> Hello Nicholas,
>
> On Tue, Apr 2, 2019 at 4:57 PM Nicholas Piggin <npiggin at gmail.com> wrote:
>>
>> Using a jiffies timer creates a dependency on the tick_do_timer_cpu
>> incrementing jiffies. If that CPU has locked up and jiffies is not
>> incrementing, the watchdog heartbeat timer for all CPUs stops and
>> creates false positives and confusing warnings on local CPUs, and
>> also causes the SMP detector to stop, so the root cause is never
>> detected.
>>
>> Fix this by using hrtimer based timers for the watchdog heartbeat,
>> like the generic kernel hardlockup detector.
>>
>> Reported-by: Ravikumar Bangoria <ravi.bangoria at in.ibm.com>
>> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
>
> [..snip..]
>
>> @@ -325,19 +325,21 @@ EXPORT_SYMBOL(arch_touch_nmi_watchdog);
>>
>> static void start_watchdog_timer_on(unsigned int cpu)
>> {
>> - struct timer_list *t = per_cpu_ptr(&wd_timer, cpu);
>> + struct hrtimer *hrtimer = this_cpu_ptr(&wd_hrtimer);
>
> This function can be called during the initialization via
>
> watchdog_nmi_start -->
> for_each_online_cpu(cpu)
> start_wd_on_cpu(cpu) -->
> start_watchdog_timer_on(cpu)
>
> Thus, it is not guaranteed that we are always calling
> start_watchdog_timer_on() from the CPU where
> we want to start the watchdog timer.
>
> Thus, should we be calling this function from start_wd_on_cpu() via an
> smp_call_function_single() ?
Good catch, yes, I think we need that change (like kernel/watchdog.c).
I'll resend.

Thanks,
Nick
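
For illustration only, here is a minimal userspace model of the problem raised above. It is not kernel code and not the resent patch: wd_timer_model, current_cpu, model_this_cpu_ptr() and model_call_on_cpu() are hypothetical stand-ins for the per-CPU wd_hrtimer, the executing CPU, this_cpu_ptr() and smp_call_function_single() respectively. It only shows why a loop over all CPUs that ends in a this_cpu_ptr() lookup arms just the caller's timer, and why running the function on the target CPU avoids that.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical CPU count for this model */

/* Stand-in for the per-CPU wd_hrtimer: one slot per CPU. */
struct model_timer {
	bool armed;
};

static struct model_timer wd_timer_model[NR_CPUS];

/* CPU the model is "executing on"; in the kernel this is implicit. */
static unsigned int current_cpu;

/* Models this_cpu_ptr(): always resolves to the executing CPU's slot. */
static struct model_timer *model_this_cpu_ptr(void)
{
	return &wd_timer_model[current_cpu];
}

/* Mirrors the reviewed hunk: 'cpu' is ignored, the local slot is armed. */
static void start_timer_local(unsigned int cpu)
{
	(void)cpu;
	model_this_cpu_ptr()->armed = true;
}

/* Models a synchronous cross-call: run fn while "on" the target CPU. */
static void model_call_on_cpu(unsigned int cpu, void (*fn)(unsigned int))
{
	unsigned int saved = current_cpu;

	assert(cpu < NR_CPUS);
	current_cpu = cpu;
	fn(cpu);
	current_cpu = saved;
}

/* Clear every slot and pretend we are running on CPU 0 (the boot CPU). */
static void reset_model(void)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		wd_timer_model[cpu].armed = false;
	current_cpu = 0;
}

static unsigned int count_armed(void)
{
	unsigned int n = 0;

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		n += wd_timer_model[cpu].armed ? 1 : 0;
	return n;
}

/* Init path as reviewed: CPU 0 loops over all CPUs and calls directly. */
static unsigned int start_all_direct(void)
{
	reset_model();
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		start_timer_local(cpu);
	return count_armed();
}

/* Suggested shape: each start runs on the CPU whose timer it arms. */
static unsigned int start_all_cross_call(void)
{
	reset_model();
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		model_call_on_cpu(cpu, start_timer_local);
	return count_armed();
}
```

In this model start_all_direct() returns 1, because every iteration arms CPU 0's slot, while start_all_cross_call() returns NR_CPUS. In the kernel the cross-call would be smp_call_function_single(cpu, func, info, wait); how the resent patch wires that into start_wd_on_cpu() is not shown here.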