[Skiboot] [PATCH] Fix possible deadlock with DEBUG build
Vasant Hegde
hegdevasant at linux.vnet.ibm.com
Fri Dec 11 16:31:06 AEDT 2020
Sample output from Cédric:
-------------------------
[ 88.294111649,7] cpu_idle_p9 called on cpu 0x063c with pm disabled
[ 88.289365222,7] cpu_idle_p9 called on cpu 0x025f with pm disabled
[ 88.289900684,7] cpu_idle_p9 called on cpu 0x045f with pm disabled
[ 88.302621295,7] CHIPTOD: Base TFMR=0x2512000000000000
[ 88.289899701,7] cpu_idle_p9 called on cpu 0x0456 with pm disabled
LOCK ERROR: Deadlock detected @0x30402740 (state: 0x0000000400000001)
[ 88.332264757,3] ***********************************************
[ 88.332300051,3] < assert failed at core/lock.c:32 >
[ 88.332328282,3] .
[ 88.332347335,3] .
[ 88.332364894,3] .
[ 88.332377963,3] OO__)
[ 88.332395458,3] <"__/
[ 88.332412628,3] ^ ^
[ 88.332450246,3] Fatal TRAP at 00000000300286a0 .lock_error+0x64 MSR 9000000000021002
[ 88.332501812,3] CFAR : 00000000300414f4 MSR : 9000000000021002
[ 88.332536539,3] SRR0 : 00000000300286a0 SRR1 : 9000000000021002
[ 88.332574644,3] HSRR0: 0000000030020024 HSRR1: 9000000000001000
[ 88.332610635,3] DSISR: 00000000 DAR : 0000000000000000
[ 88.332650628,3] LR : 0000000030028690 CTR : 00000000300f9fa0
[ 88.332684451,3] CR : 20002000 XER : 00000000
[ 88.332712767,3] GPR00: 0000000030028690 GPR16: 0000000032c98000
[ 88.332748046,3] GPR01: 0000000032c9b0a0 GPR17: 0000000000000000
[ 88.332784060,3] GPR02: 0000000030169d00 GPR18: 0000000000000000
[ 88.332822091,3] GPR03: 0000000032c9b310 GPR19: 0000000000000000
[ 88.332861357,3] GPR04: 0000000030041480 GPR20: 0000000000000000
[ 88.332897229,3] GPR05: 0000000000000000 GPR21: 0000000000000000
[ 88.332937051,3] GPR06: 0000000000000010 GPR22: 0000000000000000
[ 88.332968463,3] GPR07: 0000000000000000 GPR23: 0000000000000000
[ 88.333007333,3] GPR08: 000000000002cbb5 GPR24: 0000000000000000
[ 88.333041971,3] GPR09: 0000000000000000 GPR25: 0000000000000000
[ 88.333081073,3] GPR10: 0000000000000000 GPR26: 0000000000000003
[ 88.333114301,3] GPR11: 3839616263646566 GPR27: 0000000000000211
[ 88.333156040,3] GPR12: 0000000020002000 GPR28: 000000003042a134
[ 88.333189222,3] GPR13: 0000000000000000 GPR29: 0000000030402740
[ 88.333225638,3] GPR14: 0000000000000000 GPR30: 0000000000000001
[ 88.333259730,3] GPR15: 0000000000000000 GPR31: 0000000000000000
CPU 0211 Backtrace:
S: 0000000032c9b3b0 R: 0000000030028690 .lock_error+0x54
S: 0000000032c9b440 R: 0000000030028828 .add_lock_request+0xd0
S: 0000000032c9b4f0 R: 0000000030028a9c .lock_caller+0x8c
S: 0000000032c9b5a0 R: 0000000030021b30 .__mcount_stack_check+0x70
S: 0000000032c9b650 R: 00000000300fabb0 .list_check_node+0x1c
S: 0000000032c9b6f0 R: 00000000300fac98 .list_check+0x38
S: 0000000032c9b790 R: 00000000300289bc .try_lock_caller+0xac
S: 0000000032c9b830 R: 0000000030028ad8 .lock_caller+0xc8
S: 0000000032c9b8e0 R: 0000000030028d74 .lock_recursive_caller+0x54
S: 0000000032c9b980 R: 0000000030020cb8 .console_write+0x48
S: 0000000032c9ba30 R: 00000000300445a8 .vprlog+0xc8
S: 0000000032c9bc20 R: 0000000030044630 ._prlog+0x50
S: 0000000032c9bcb0 R: 0000000030029204 .cpu_idle_p9+0x74
S: 0000000032c9bd40 R: 0000000030029628 .cpu_idle_pm+0x4c
S: 0000000032c9bde0 R: 0000000030023fe0 .__secondary_cpu_entry+0xa0
S: 0000000032c9be70 R: 0000000030024034 .secondary_cpu_entry+0x40
S: 0000000032c9bf00 R: 0000000030003290 secondary_wait+0x8c
CPU 0x4:
opal_run_pollers ->
check_stacks -> takes stack_check_lock lock
prlog ->
console_write -> waits for con_lock
CPU 0x211
cpu_idle_p9 ->
prlog ->
console_write -> Takes con_lock lock
list_check_node -> tries to take stack_check_lock and hits deadlock.
I think we don't need to hold `stack_check_lock` while printing
backtraces. Instead it makes sense to hold backtrace lock (bt_lock)
and print output.
Reported-by: Cédric Le Goater <clg at kaod.org>
Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
---
core/stack.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/core/stack.c b/core/stack.c
index 05506f6b3..3edf98411 100644
--- a/core/stack.c
+++ b/core/stack.c
@@ -250,7 +250,7 @@ void check_stacks(void)
unlock(&stack_check_lock);
}
if (lowest) {
- lock(&stack_check_lock);
+ lock(&bt_lock);
prlog(PR_NOTICE, "CPU %04x lowest stack mark %lld bytes left"
" pc=%08llx token=%lld\n",
lowest->pir, lowest->stack_bot_mark, lowest->stack_bot_pc,
@@ -258,7 +258,7 @@ void check_stacks(void)
backtrace_print(lowest->stack_bot_bt,
&lowest->stack_bot_bt_metadata,
NULL, NULL, true);
- unlock(&stack_check_lock);
+ unlock(&bt_lock);
}
this_cpu()->in_mcount = false;
--
2.26.2
More information about the Skiboot
mailing list