[PATCH] pseries/hotplug: Add more delay in pseries_cpu_die while waiting for rtas-stop

Christophe Leroy christophe.leroy at csgroup.eu
Thu Jul 7 19:51:29 AEST 2022



Le 06/12/2018 à 12:31, Gautham R. Shenoy a écrit :
> From: "Gautham R. Shenoy" <ego at linux.vnet.ibm.com>
> 
> Currently running DLPAR offline/online operations in a loop on a
> POWER9 system with SMT=off results in the following crash:
> 
> [  223.321032] cpu 112 (hwid 112) Ready to die...
> [  223.355963] Querying DEAD? cpu 113 (113) shows 2
> [  223.356233] cpu 114 (hwid 114) Ready to die...
> [  223.356236] cpu 113 (hwid 113) Ready to die...
> [  223.356241] Bad kernel stack pointer 1faf6ca0 at 1faf6d50
> [  223.356243] Oops: Bad kernel stack pointer, sig: 6 [#1]
> [  223.356244] LE SMP NR_CPUS=2048 NUMA pSeries
> [  223.356247] Modules linked in:
> [  223.356255] CPU: 114 PID: 0 Comm: swapper/114 Not tainted 4.20.0-rc3 #39
> [  223.356258] NIP:  000000001faf6d50 LR: 000000001faf6d50 CTR: 000000001ec6d06c
> [  223.356259] REGS: c00000001e5cbd30 TRAP: 0700   Not tainted  (4.20.0-rc3)
> [  223.356260] MSR:  8000000000081000 <SF,ME>  CR: 28000004  XER: 00000020
> [  223.356263] CFAR: 000000001ec98590 IRQMASK: 8000000000009033
>                 GPR00: 000000001faf6d50 000000001faf6ca0 000000001ed1c448 00000267e6a273c3
>                 GPR04: 0000000000000000 00000000000000e0 000000000000dfe8 000000001faf6d30
>                 GPR08: 000000001faf6d28 00000267e6a273c3 000000001ec1b108 0000000000000000
>                 GPR12: 0000000001b6d998 c00000001eb55080 c0000003a1b8bf90 000000001eea3f40
>                 GPR16: 0000000000000000 c0000006fda85100 c00000000004c8b0 c0000000013d5300
>                 GPR20: c0000006fda85300 0000000000000008 c0000000019d2cf8 c0000000013d6888
>                 GPR24: 0000000000000072 c0000000013d688c 0000000000000002 c0000000013d688c
>                 GPR28: c0000000019cecf0 0000000000000390 0000000000000000 000000001faf6ca0
> [  223.356281] NIP [000000001faf6d50] 0x1faf6d50
> [  223.356281] LR [000000001faf6d50] 0x1faf6d50
> [  223.356282] Call Trace:
> [  223.356283] Instruction dump:
> [  223.356285] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [  223.356286] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [  223.356290] ---[ end trace f46a4e046b564d1f ]---
> 
> This is due to multiple offlined CPUs (CPU 113 and CPU 114 above)
> concurrently (within 3us) trying to make the rtas-call with the
> "stop-self" token, something that is prohibited by the PAPR.
> 
> The concurrent calls can happen as follows.
> 
>    o In dlpar_offline_cpu() we prod an offline CPU X (offline due to
>      SMT=off) and loop for 25 tries in pseries_cpu_die() querying if
>      the target CPU X has been stopped in RTAS. After 25 tries, we
>      prints the message "Querying DEAD? cpu X (X) shows 2" and return
>      to dlpar_offline_cpu(). Note that at this point CPU X has not yet
>      called rtas with the "stop-self" token, but can do so anytime now.
> 
>    o Back in dlpar_offline_cpu(), we prod the next offline CPU Y. CPU Y
>      promptly wakes up and calls RTAS with the "stop-self" token.
> 
>    o Before RTAS can stop CPU Y, CPU X also calls RTAS with the
>      "stop-self" token.
> 
> The problem occurs due to the short number of tries (25) in
> pseries_cpu_die() which covers 90% of the cases. For the remaining 10%
> of the cases, it was observed that we would need to loop for a few
> hundred iterations before the target CPU calls rtas with "stop-self"
> token.Experiments show that each try takes roughly ~25us.
> 
> In this patch we fix the problem by increasing the number of tries
> from 25 to 4000 (which roughly corresponds to 100ms) before bailing
> out and declaring that we have failed to observe the target CPU call
> rtas-stop-self. This provides sufficient serialization to ensure that
> there no concurrent rtas calls with "stop-self" token.
> 
> Reported-by: Michael Bringmann <mwb at linux.vnet.ibm.com>
> Signed-off-by: Gautham R. Shenoy <ego at linux.vnet.ibm.com>

This patch badly conflicts with 801980f64979 
("powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death")

Is it still applicable ? If so, please rebase and resubmit.

Thanks
Christophe

> ---
>   arch/powerpc/platforms/pseries/hotplug-cpu.c | 13 ++++++++++++-
>   1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index 2f8e621..c913c44 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -214,14 +214,25 @@ static void pseries_cpu_die(unsigned int cpu)
>   			msleep(1);
>   		}
>   	} else if (get_preferred_offline_state(cpu) == CPU_STATE_OFFLINE) {
> +		int max_tries = 4000; /* Roughly corresponds to 100ms */
> +		u64 tb_before = mftb();
>   
> -		for (tries = 0; tries < 25; tries++) {
> +		for (tries = 0; tries < max_tries; tries++) {
>   			cpu_status = smp_query_cpu_stopped(pcpu);
>   			if (cpu_status == QCSS_STOPPED ||
>   			    cpu_status == QCSS_HARDWARE_ERROR)
>   				break;
>   			cpu_relax();
>   		}
> +
> +		if (tries == max_tries) {
> +			u64 time_elapsed_us = div_u64(mftb() - tb_before,
> +						      tb_ticks_per_usec);
> +
> +			pr_warn("Offlined CPU %d isn't stopped by RTAS after %llu us\n",
> +				cpu, time_elapsed_us);
> +			WARN_ON(1);
> +		}
>   	}
>   
>   	if (cpu_status != 0) {


More information about the Linuxppc-dev mailing list