[PATCH 2/2] powerpc: Add ppc64 hard lockup detector support

Anton Blanchard anton at samba.org
Tue Aug 5 14:56:21 EST 2014


The hard lockup detector uses a PMU event as a periodic NMI to
detect if we are stuck (where stuck means no timer interrupts have
occurred).

Ben's rework of the ppc64 soft disable code has made ppc64 PMU
exceptions a partial NMI. They can get disabled if an external interrupt
comes in, but otherwise PMU interrupts will fire in interrupt disabled
regions.

I wrote a kernel module to test this patch and noticed we sometimes
missed hard lockup warnings. The RCU code detected the stall first and
issued an IPI to backtrace all CPUs. Unfortunately an IPI is an external
interrupt and that will hard disable interrupts, preventing the hard
lockup detector from going off.

If I reduced the hard lockup threshold to 5 seconds:

echo 5 > /proc/sys/kernel/watchdog_thresh

Then it would beat the RCU code in detecting a stall and get a
correct backtrace out.

Another downside is that our PMCs can only count to 2^31, so even when
we ask for 10 seconds of processor cycles, we end up taking a couple
of PMU exceptions a second.

Signed-off-by: Anton Blanchard <anton at samba.org>
---

Index: b/arch/powerpc/Kconfig
===================================================================
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -145,6 +145,7 @@ config PPC
 	select HAVE_IRQ_EXIT_ON_IRQ_STACK
 	select ARCH_USE_CMPXCHG_LOCKREF if PPC64
 	select HAVE_ARCH_AUDITSYSCALL
+	select HAVE_PERF_EVENTS_NMI if PPC64
 
 config GENERIC_CSUM
 	def_bool CPU_LITTLE_ENDIAN
Index: b/arch/powerpc/include/asm/nmi.h
===================================================================
--- /dev/null
+++ b/arch/powerpc/include/asm/nmi.h
@@ -0,0 +1,4 @@
+#ifndef _ASM_NMI_H
+#define _ASM_NMI_H
+
+#endif /* _ASM_NMI_H */
Index: b/arch/powerpc/kernel/setup_64.c
===================================================================
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -796,3 +796,10 @@ unsigned long memory_block_size_bytes(vo
 struct ppc_pci_io ppc_pci_io;
 EXPORT_SYMBOL(ppc_pci_io);
 #endif
+
+#ifdef CONFIG_HARDLOCKUP_DETECTOR
+u64 hw_nmi_get_sample_period(int watchdog_thresh)
+{
+	return ppc_proc_freq * watchdog_thresh;
+}
+#endif


More information about the Linuxppc-dev mailing list