[PATCH] powerpc/64: prevent replayed interrupt handlers from running softirqs

Nicholas Piggin npiggin at gmail.com
Fri Jan 22 11:33:01 AEDT 2021


Excerpts from Michael Ellerman's message of January 21, 2021 10:50 pm:
> Nicholas Piggin <npiggin at gmail.com> writes:
>> Running softirqs enables interrupts, which can then end up recursing
>> into the irq soft-mask code we're adjusting, including replaying
>> interrupts itself, which might be theoretically unbounded.
>>
>> This abridged trace shows how this can occur:
>>
>> ! NIP replay_soft_interrupts
>>   LR  interrupt_exit_kernel_prepare
>>   Call Trace:
>>     interrupt_exit_kernel_prepare (unreliable)
>>     interrupt_return
>>   --- interrupt: ea0 at __rb_reserve_next
>>   NIP __rb_reserve_next
>>   LR __rb_reserve_next
>>   Call Trace:
>>     ring_buffer_lock_reserve
>>     trace_function
>>     function_trace_call
>>     ftrace_call
>>     __do_softirq
>>     irq_exit
>>     timer_interrupt
>> !   replay_soft_interrupts
>>     interrupt_exit_kernel_prepare
>>     interrupt_return
>>   --- interrupt: ea0 at arch_local_irq_restore
>>
>> Fix this by disabling bhs (softirqs) around the interrupt replay.
>>
>> I don't know that commit 3282a3da25bd ("powerpc/64: Implement soft
>> interrupt replay in C") actually introduced the problem. Prior to that
>> change, the interrupt replay seems like it should still be subect to
>> this recusion, however it's done after all the irq state has been fixed
>> up at the end of the replay, so it seems reasonable to fix back to this
>> commit.
> 
> This seems very unhappy for me (on P9 bare metal):

Oh, damn lockdep I always forget it has a bunch of interrupt checks.

In this case it might have a point though. Thanks, I'll try to fix it.

Thanks,
Nick

> 
> [    0.038571] Mountpoint-cache hash table entries: 131072 (order: 4, 1048576 bytes, linear)
> [    0.040194] ------------[ cut here ]------------
> [    0.040228] WARNING: CPU: 0 PID: 0 at kernel/softirq.c:176 __local_bh_enable_ip+0x150/0x210
> [    0.040263] Modules linked in:
> [    0.040280] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.0-rc2-00008-g4899f32e4f2a #1
> [    0.040321] NIP:  c000000000114bc0 LR: c0000000000172a0 CTR: c00000000002a020
> [    0.040360] REGS: c00000000177f670 TRAP: 0700   Not tainted  (5.11.0-rc2-00008-g4899f32e4f2a)
> [    0.040410] MSR:  9000000002021033 <SF,HV,VEC,ME,IR,DR,RI,LE>  CR: 28000224  XER: 20040000
> [    0.040472] CFAR: c000000000114ae8 IRQMASK: 3
>                GPR00: c0000000000172a0 c00000000177f910 c000000001783900 c000000000017290
>                GPR04: 0000000000000200 4000000000000000 0000000000000002 00000001312d0000
>                GPR08: 0000000000000000 c0000000016f3480 0000000000000202 0000000000000000
>                GPR12: c00000000002a020 c0000000023a0000 0000000000000000 0000000000000000
>                GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>                GPR20: 0000000000000001 00000000100051c6 0000000000000000 0000000000000009
>                GPR24: 0000000000000e60 0000000000000900 0000000000000500 0000000000000a00
>                GPR28: 0000000000000f00 0000000000000002 0000000000000003 0000000000000200
> [    0.040824] NIP [c000000000114bc0] __local_bh_enable_ip+0x150/0x210
> [    0.040863] LR [c0000000000172a0] replay_soft_interrupts+0x2e0/0x340
> [    0.040904] Call Trace:
> [    0.040926] [c00000000177f910] [0000000000000500] 0x500 (unreliable)
> [    0.040962] [c00000000177f950] [c0000000000172a0] replay_soft_interrupts+0x2e0/0x340
> [    0.041008] [c00000000177fb50] [c000000000017370] arch_local_irq_restore+0x70/0xe0
> [    0.041042] [c00000000177fb80] [c000000000476514] kmem_cache_alloc+0x474/0x520
> [    0.041066] [c00000000177fc00] [c0000000004e394c] __d_alloc+0x4c/0x2e0
> [    0.041109] [c00000000177fc50] [c0000000004e40ac] d_make_root+0x3c/0xa0
> [    0.041142] [c00000000177fc80] [c000000000679ce0] ramfs_fill_super+0x80/0xb0
> [    0.041186] [c00000000177fcb0] [c0000000004c1b04] get_tree_nodev+0xb4/0x130
> [    0.041230] [c00000000177fcf0] [c000000000679578] ramfs_get_tree+0x28/0x40
> [    0.041282] [c00000000177fd10] [c0000000004bee78] vfs_get_tree+0x48/0x120
> [    0.041325] [c00000000177fd80] [c0000000004f7fe0] vfs_kern_mount.part.0+0xd0/0x130
> [    0.041368] [c00000000177fdc0] [c000000001366700] mnt_init+0x1c8/0x2fc
> [    0.041420] [c00000000177fe60] [c000000001366178] vfs_caches_init+0x110/0x138
> [    0.041454] [c00000000177fee0] [c000000001331020] start_kernel+0x6d8/0x780
> [    0.041497] [c00000000177ff90] [c00000000000d354] start_here_common+0x1c/0x5c8
> [    0.041539] Instruction dump:
> [    0.041555] e9490002 394a0001 91490000 e90d0028 3d42ffcc 394a4730 7d0a42aa e9490002
> [    0.041608] 2c280000 394affff 91490000 4082ff30 <0fe00000> 892d0988 39400001 994d0988
> [    0.041660] irq event stamp: 555
> [    0.041674] hardirqs last  enabled at (553): [<c00000000047654c>] kmem_cache_alloc+0x4ac/0x520
> [    0.041707] hardirqs last disabled at (554): [<c000000000017368>] arch_local_irq_restore+0x68/0xe0
> [    0.041750] softirqs last  enabled at (0): [<0000000000000000>] 0x0
> [    0.041778] softirqs last disabled at (555): [<c000000000016fd0>] replay_soft_interrupts+0x10/0x340
> [    0.041824] ---[ end trace aa6f9769e07e43db ]---
> 
> 
> And lots and lots of these, or similar:
> 
> 
> [   14.369838] =============================
> [   14.369839] WARNING: suspicious RCU usage
> [   14.369841] 5.11.0-rc2-00008-g4899f32e4f2a #1 Tainted: G        W
> [   14.369843] -----------------------------
> [   14.369844] include/linux/rcupdate.h:692 rcu_read_unlock() used illegally while idle!
> [   14.369846]
>                other info that might help us debug this:
> 
> [   14.369848]
>                rcu_scheduler_active = 2, debug_locks = 1
> [   14.369850] RCU used illegally from extended quiescent state!
> [   14.369851] 2 locks held by swapper/32/0:
> [   14.369853]  #0: c0000000015e6fc0 (rcu_callback){....}-{0:0}, at: rcu_core+0x2e0/0x990
> [   14.369864]  #1: c0000000015e6f30 (rcu_read_lock){....}-{1:3}, at: kmem_cache_free+0x7cc/0x7e0
> [   14.369874]
>                stack backtrace:
> [   14.369876] CPU: 32 PID: 0 Comm: swapper/32 Tainted: G        W         5.11.0-rc2-00008-g4899f32e4f2a #1
> [   14.369879] Call Trace:
> [   14.369880] [c000001fff557c10] [c0000000008630b8] dump_stack+0xec/0x144 (unreliable)
> [   14.369886] [c000001fff557c60] [c0000000001ad2d0] lockdep_rcu_suspicious+0x124/0x144
> [   14.369890] [c000001fff557cf0] [c00000000047783c] kmem_cache_free+0x2ac/0x7e0
> [   14.369894] [c000001fff557db0] [c0000000004bdeac] file_free_rcu+0x5c/0xa0
> [   14.369898] [c000001fff557de0] [c0000000001e214c] rcu_core+0x33c/0x990
> [   14.369902] [c000001fff557e90] [c000000000f496d0] __do_softirq+0x180/0x688
> [   14.369906] [c000001fff557f90] [c0000000000307bc] call_do_softirq+0x14/0x24
> [   14.369911] [c000000002e1fab0] [c000000000017418] do_softirq_own_stack+0x38/0x50
> [   14.369916] [c000000002e1fad0] [c000000000114a60] do_softirq+0x120/0x130
> [   14.369920] [c000000002e1fb00] [c000000000114c64] __local_bh_enable_ip+0x1f4/0x210
> [   14.369924] [c000000002e1fb40] [c0000000000172a0] replay_soft_interrupts+0x2e0/0x340
> [   14.369928] [c000000002e1fd40] [c000000000017370] arch_local_irq_restore+0x70/0xe0
> [   14.369933] [c000000002e1fd70] [c000000000c87184] snooze_loop+0x64/0x2e4
> [   14.369937] [c000000002e1fdb0] [c000000000c84204] cpuidle_enter_state+0x2e4/0x550
> [   14.369941] [c000000002e1fe10] [c000000000c8450c] cpuidle_enter+0x4c/0x70
> [   14.369946] [c000000002e1fe50] [c00000000016892c] call_cpuidle+0x4c/0x90
> [   14.369949] [c000000002e1fe70] [c000000000168d74] do_idle+0x2f4/0x380
> [   14.369953] [c000000002e1ff10] [c000000000169208] cpu_startup_entry+0x38/0x40
> [   14.369957] [c000000002e1ff40] [c000000000053484] start_secondary+0x2a4/0x2b0
> [   14.369961] [c000000002e1ff90] [c00000000000d254] start_secondary_prolog+0x10/0x14
> 
> 
> cheers
> 


More information about the Linuxppc-dev mailing list