[PATCH v1] powerpc/watchdog: help remote CPUs to flush NMI printk output

Daniel Axtens dja at axtens.net
Fri Nov 12 17:08:52 AEDT 2021


Hi,

> The printk layer at the moment does not seem to have a good way to force
> flush printk messages that are created in NMI context, except in the
> panic path.
>
> NMI-context printk messages normally get to the console with irq_work,
> but that won't help if the CPU is stuck with irqs disabled, as can be
> the case for hard lockup watchdog messages.
>
> The watchdog currently flushes the printk buffers after detecting a
> lockup on remote CPUs, but they may not have processed their NMI IPI
> yet by that stage, or they may have self-detected a lockup in which
> case they won't go via this NMI IPI path.
>
> Improve the situation by having NMI-context mark a flag if it called
> printk, and have watchdog timer interrupts check if that flag was set
> and try to flush if it was. Latency is not a big problem because we
> were already stuck for a while, just need to try to make sure the
> messages eventually make it out.

Initially I was surprised that this doesn't affect the printk code
itself, just the powerpc code...

>
> Cc: Laurent Dufour <ldufour at linux.ibm.com>
> Signed-off-by: Nicholas Piggin <npiggin at gmail.com>
> ---
> This patch is actually based on top of this one which is planned to go
> upstream in rc1/2. https://marc.info/?l=linux-kernel&m=163626070312052&w=2
>
> Prior to commit 93d102f094be that is fixed by the above, we had a printk
> flush function with a different name but basically does the same job, so
> this patch can be backported, just needs some care. I'm posting it for
> review now for feedback so it's ready to go when the printk patch is
> upstream.
>
> Thanks,
> Nick
>
>  arch/powerpc/kernel/watchdog.c | 29 +++++++++++++++++++++++------
>  1 file changed, 23 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index b6533539386b..a7b6b0691203 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -86,6 +86,7 @@ static DEFINE_PER_CPU(u64, wd_timer_tb);
>  /* SMP checker bits */
>  static unsigned long __wd_smp_lock;
>  static unsigned long __wd_reporting;
> +static unsigned long __wd_nmi_output;
>  static cpumask_t wd_smp_cpus_pending;
>  static cpumask_t wd_smp_cpus_stuck;
>  static u64 wd_smp_last_reset_tb;
> @@ -154,6 +155,18 @@ static void wd_lockup_ipi(struct pt_regs *regs)
>  	else
>  		dump_stack();
>  
> +	/*
> +	 * We printk()ed from NMI context, the contents may not get flushed
> +	 * if we return to a context with interrupts disabled because
> +	 * printk uses irq_work to schedule flushes of NMI output.
> +	 * __wd_nmi_output says there has been output from NMI context, so
> +	 * other CPUs are recruited to help flush it.
> +	 *
> +	 * xchg is not needed here (it could be a simple atomic store), but
> +	 * it gives the memory ordering and atomicity required.
> +	 */
> +	xchg(&__wd_nmi_output, 1);
> +
>  	/* Do not panic from here because that can recurse into NMI IPI layer */
>  }

I think, looking at this and the other site where __wd_nmi_output is
set, that this works because you set the flag only when you are done
printing from the non-panic lockup context on this CPU. I was initially
worried that you set this flag part way through printing, and then it
might get cleared by another CPU while you're still trying to print.
However, in this function it's right at the end - there's nothing else
left to do, and ...

>  DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
> @@ -386,6 +401,8 @@ DEFINE_INTERRUPT_HANDLER_NMI(soft_nmi_interrupt)
>  		print_irqtrace_events(current);
>  		show_regs(regs);
>  
> +		xchg(&__wd_nmi_output, 1); // see wd_lockup_ipi
> +
>  		if (sysctl_hardlockup_all_cpu_backtrace)
>  			trigger_allbutself_cpu_backtrace();

in this one, the only things that can happen afterwards are
 - a panic, which does its own flushing, and

- trigger_allbutself_cpu_backtrace(), which seems to just send IPIs, not
 do any printing of its own.

All of which is fine, but I wonder if we need a more descriptive
variable name or if the comment needs to specify that the flag should
only be set at certain times.

Kind regards,
Daniel


More information about the Linuxppc-dev mailing list