[PATCH] powerpc/watchdog: Use hrtimers for per-CPU heartbeat
Gautham R Shenoy
ego.lkml at gmail.com
Thu Apr 4 22:19:47 AEDT 2019
Hello Nicholas,
On Tue, Apr 2, 2019 at 4:57 PM Nicholas Piggin <npiggin at gmail.com> wrote:
>
> Using a jiffies timer creates a dependency on the tick_do_timer_cpu
> incrementing jiffies. If that CPU has locked up and jiffies is not
> incrementing, the watchdog heartbeat timer for all CPUs stops and
> creates false positives and confusing warnings on local CPUs, and
> also causes the SMP detector to stop, so the root cause is never
> detected.
>
> Fix this by using hrtimer based timers for the watchdog heartbeat,
> like the generic kernel hardlockup detector.
>
> Reported-by: Ravikumar Bangoria <ravi.bangoria at in.ibm.com>
> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
[..snip..]
> @@ -325,19 +325,21 @@ EXPORT_SYMBOL(arch_touch_nmi_watchdog);
>
> static void start_watchdog_timer_on(unsigned int cpu)
> {
> - struct timer_list *t = per_cpu_ptr(&wd_timer, cpu);
> + struct hrtimer *hrtimer = this_cpu_ptr(&wd_hrtimer);
This function can be called during the initialization via
watchdog_nmi_start -->
for_each_online_cpu(cpu)
start_wd_on_cpu(cpu) -->
start_watchdog_timer_on(cpu)
Thus, it is not guarateed that we are always calling
start_watchdog_timer_on() from the CPU where
we want to start the watchdog timer.
Thus, should we be calling this function from start_wd_on_cpu() via an
smp_call_function_single() ?
>
> per_cpu(wd_timer_tb, cpu) = get_tb();
>
> - timer_setup(t, wd_timer_fn, TIMER_PINNED);
> - wd_timer_reset(cpu, t);
> + hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> + hrtimer->function = watchdog_timer_fn;
> + hrtimer_start(hrtimer, ms_to_ktime(wd_timer_period_ms),
> + HRTIMER_MODE_REL_PINNED);
> }
>
> static void stop_watchdog_timer_on(unsigned int cpu)
> {
> - struct timer_list *t = per_cpu_ptr(&wd_timer, cpu);
> + struct hrtimer *hrtimer = this_cpu_ptr(&wd_hrtimer);
>
> - del_timer_sync(t);
> + hrtimer_cancel(hrtimer);
> }
>
> static int start_wd_on_cpu(unsigned int cpu)
> --
> 2.20.1
>
--
Thanks and Regards
gautham.
More information about the Linuxppc-dev
mailing list