[RFC PATCH 21/23] watchdog/hardlockup/hpet: Adjust timer expiration on the number of monitored CPUs

Ricardo Neri ricardo.neri-calderon at linux.intel.com
Wed Jun 13 10:57:41 AEST 2018


Each CPU should be monitored for hardlockups every watchdog_thresh seconds.
Since all the CPUs in the system are monitored by the same timer and the
timer interrupt is rotated among the monitored CPUs, the timer must expire
every watchdog_thresh/N seconds; where N is the number of monitored CPUs.

A new member, ticks_per_cpu, is added to struct hpet_hld_data to hold the
timer's ticks per second divided by the number of monitored CPUs. This
quantity is used to program the comparator of the timer.

The ticks-per-CPU quantity is updated every time the number of monitored
CPUs changes, that is, when the watchdog is enabled or disabled for a
specific CPU.
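
As an illustration only, the arithmetic can be modeled in userspace as in
the sketch below. struct hld_model, its monitored field (standing in for
cpumask_weight() on monitored_mask) and the model_* helpers are
hypothetical names and are not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model; not the kernel's struct hpet_hld_data. */
struct hld_model {
	uint64_t ticks_per_second;	/* frequency of the timer */
	uint64_t ticks_per_cpu;		/* ticks_per_second / monitored */
	unsigned int monitored;		/* number of monitored CPUs (N) */
};

/* Recompute ticks_per_cpu; keep the old value if no CPU is monitored. */
static void model_update_ticks_per_cpu(struct hld_model *m)
{
	if (!m->monitored)
		return;

	m->ticks_per_cpu = m->ticks_per_second / m->monitored;
}

/*
 * Comparator value for the next expiration, thresh/N seconds from count.
 * Unsigned arithmetic lets the result wrap around if needed.
 */
static uint64_t model_next_compare(const struct hld_model *m, uint64_t count,
				   unsigned int thresh)
{
	/* The timer is only kicked while at least one CPU is monitored. */
	assert(m->monitored > 0);

	return count + (uint64_t)thresh * m->ticks_per_cpu;
}
```

With an assumed 1000 ticks per second, 4 monitored CPUs and a threshold of
10 seconds, ticks_per_cpu is 250 and the comparator is programmed 2500
ticks ahead, i.e. the timer expires every 2.5 seconds.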

Cc: Ashok Raj <ashok.raj at intel.com>
Cc: Andi Kleen <andi.kleen at intel.com>
Cc: Tony Luck <tony.luck at intel.com>
Cc: Borislav Petkov <bp at suse.de>
Cc: Jacob Pan <jacob.jun.pan at intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki at intel.com>
Cc: Don Zickus <dzickus at redhat.com>
Cc: Nicholas Piggin <npiggin at gmail.com>
Cc: Michael Ellerman <mpe at ellerman.id.au>
Cc: Frederic Weisbecker <frederic at kernel.org>
Cc: Alexei Starovoitov <ast at kernel.org>
Cc: Babu Moger <babu.moger at oracle.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers at efficios.com>
Cc: Masami Hiramatsu <mhiramat at kernel.org>
Cc: Peter Zijlstra <peterz at infradead.org>
Cc: Andrew Morton <akpm at linux-foundation.org>
Cc: Philippe Ombredanne <pombredanne at nexb.com>
Cc: Colin Ian King <colin.king at canonical.com>
Cc: Byungchul Park <byungchul.park at lge.com>
Cc: "Paul E. McKenney" <paulmck at linux.vnet.ibm.com>
Cc: "Luis R. Rodriguez" <mcgrof at kernel.org>
Cc: Waiman Long <longman at redhat.com>
Cc: Josh Poimboeuf <jpoimboe at redhat.com>
Cc: Randy Dunlap <rdunlap at infradead.org>
Cc: Davidlohr Bueso <dave at stgolabs.net>
Cc: Christoffer Dall <cdall at linaro.org>
Cc: Marc Zyngier <marc.zyngier at arm.com>
Cc: Kai-Heng Feng <kai.heng.feng at canonical.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
Cc: David Rientjes <rientjes at google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar at intel.com>
Cc: x86 at kernel.org
Cc: iommu at lists.linux-foundation.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon at linux.intel.com>
---
 arch/x86/include/asm/hpet.h |  1 +
 kernel/watchdog_hld_hpet.c  | 41 ++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 6ace2d1..e67818d 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -124,6 +124,7 @@ struct hpet_hld_data {
 	u32		irq;
 	u32		flags;
 	u64		ticks_per_second;
+	u64		ticks_per_cpu;
 	struct cpumask	monitored_mask;
 	spinlock_t	lock; /* serialized access to monitored_mask */
 };
diff --git a/kernel/watchdog_hld_hpet.c b/kernel/watchdog_hld_hpet.c
index c40acfd..ebb820d 100644
--- a/kernel/watchdog_hld_hpet.c
+++ b/kernel/watchdog_hld_hpet.c
@@ -65,11 +65,21 @@ static void kick_timer(struct hpet_hld_data *hdata)
 	 * are able to update the comparator before the counter reaches such new
 	 * value.
 	 *
+	 * The timer must monitor each CPU every watchdog_thresh seconds.
+	 * Since the interrupt is rotated among the monitored CPUs, the timer
+	 * must expire every:
+	 *
+	 *    watchdog_thresh/N
+	 *
+	 * seconds, where N is the number of monitored CPUs. ticks_per_cpu is
+	 * ticks_per_second/N, so watchdog_thresh * ticks_per_cpu gives the
+	 * number of ticks needed to meet the condition above.
+	 *
 	 * Let it wrap around if needed.
 	 */
 	count = get_count();
 
-	new_compare = count + watchdog_thresh * hdata->ticks_per_second;
+	new_compare = count + watchdog_thresh * hdata->ticks_per_cpu;
 
 	set_comparator(hdata, new_compare);
 }
@@ -160,6 +170,33 @@ static bool is_hpet_wdt_interrupt(struct hpet_hld_data *hdata)
 }
 
 /**
+ * update_ticks_per_cpu() - Update the number of HPET ticks per CPU
+ * @hdata:	struct with the timer's ticks-per-second and monitored CPU mask
+ *
+ * From the overall ticks-per-second of the timer, compute the ticks-per-cpu
+ * quantity by dividing by the number of CPUs that the watchdog currently
+ * monitors. The timer expires every watchdog_thresh * ticks_per_cpu ticks so
+ * that each CPU is monitored every watchdog_thresh seconds.
+ *
+ * Returns:
+ *
+ * None
+ *
+ */
+static void update_ticks_per_cpu(struct hpet_hld_data *hdata)
+{
+	unsigned int num_cpus = cpumask_weight(&hdata->monitored_mask);
+	unsigned long long temp = hdata->ticks_per_second;
+
+	/* Only update if there are monitored CPUs. */
+	if (!num_cpus)
+		return;
+
+	do_div(temp, num_cpus);
+	hdata->ticks_per_cpu = temp;
+}
+
+/**
  * hardlockup_detector_irq_handler() - Interrupt handler
  * @irq:	Interrupt number
  * @data:	Data associated with the interrupt
@@ -390,6 +427,7 @@ static void hardlockup_detector_hpet_enable(void)
 	spin_lock(&hld_data->lock);
 
 	cpumask_set_cpu(cpu, &hld_data->monitored_mask);
+	update_ticks_per_cpu(hld_data);
 
 	/*
 	 * If this is the first CPU to be monitored, set everything in motion:
@@ -425,6 +463,7 @@ static void hardlockup_detector_hpet_disable(void)
 	spin_lock(&hld_data->lock);
 
 	cpumask_clear_cpu(smp_processor_id(), &hld_data->monitored_mask);
+	update_ticks_per_cpu(hld_data);
 
 	/* Only disable the timer if there are no more CPUs to monitor. */
 	if (!cpumask_weight(&hld_data->monitored_mask))
-- 
2.7.4


