Next March 25: Boot failure on powerpc [recursive locking detected]

Sachin Sant sachinp at in.ibm.com
Thu Mar 26 01:13:08 EST 2009


Today's next (next-20090325) failed to boot on a powerpc box
(Power6 blade IBM,7998-61X) with the following recursive locking message.

=============================================
[ INFO: possible recursive locking detected ]
2.6.29-next-20090325 #1
---------------------------------------------
khelper/1202 is trying to acquire lock:
 (&q->__queue_lock){..-...}, at: [<c0000000002f44d0>] .blk_end_io+0x88/0xd8

but task is already holding lock:
 (&q->__queue_lock){..-...}, at: [<c000000000407874>] .__scsi_queue_insert+0xc4/0x128

other info that might help us debug this:
1 lock held by khelper/1202:
 #0:  (&q->__queue_lock){..-...}, at: [<c000000000407874>] .__scsi_queue_insert+0xc4/0x128

stack backtrace:
Call Trace:
[c00000000ffff940] [c0000000000115f4] .show_stack+0x70/0x184 (unreliable)
[c00000000ffff9f0] [c0000000000964c0] .validate_chain+0x6a8/0xe64
[c00000000ffffab0] [c000000000097494] .__lock_acquire+0x818/0x8e0
[c00000000ffffba0] [c000000000097664] .lock_acquire+0x108/0x154
[c00000000ffffc60] [c00000000058d2e4] ._spin_lock_irqsave+0x54/0x84
[c00000000ffffd00] [c0000000002f44d0] .blk_end_io+0x88/0xd8
[c00000000ffffda0] [c000000000407888] .__scsi_queue_insert+0xd8/0x128
[c00000000ffffe50] [c0000000002fa248] .blk_done_softirq+0xb0/0xe0
[c00000000ffffee0] [c00000000006f824] .__do_softirq+0x120/0x298
[c00000000fffff90] [c00000000002ba94] .call_do_softirq+0x14/0x24
[c000000044453330] [c00000000000d674] .do_softirq+0x94/0x114
[c0000000444533d0] [c00000000006fab0] .irq_exit+0x70/0x88
[c000000044453450] [c00000000000db2c] .do_IRQ+0x1c8/0x210
[c000000044453500] [c000000000004814] hardware_interrupt_entry+0x1c/0x20
--- Exception: 501 at .raw_local_irq_restore+0x3c/0x40
    LR = .kmem_cache_alloc+0xf4/0x14c
[c0000000444537f0] [c000000000112038] .kmem_cache_alloc+0xe8/0x14c (unreliable)
[c0000000444538a0] [c00000000001154c] .alloc_thread_info+0x28/0x60
[c000000044453920] [c0000000000663b0] .copy_process+0xe4/0x1168
[c000000044453a10] [c000000000067804] .do_fork+0x194/0x438
[c000000044453b30] [c000000000011a18] .sys_clone+0x5c/0x74
[c000000044453ba0] [c000000000008788] .ppc_clone+0x8/0xc
--- Exception: c00 at .kernel_thread+0x28/0x70
    LR = .wait_for_helper+0x38/0xb0
[c000000044453e90] [0000000000000078] 0x78 (unreliable)
[c000000044453f00] [c00000000007f200] .wait_for_helper+0x24/0xb0
[c000000044453f90] [c00000000002bd9c] .kernel_thread+0x54/0x70
BUG: spinlock lockup on CPU#0, khelper/1202, c0000000449b0368
Call Trace:
[c00000000ffffb10] [c0000000000115f4] .show_stack+0x70/0x184 (unreliable)
[c00000000ffffbc0] [c000000000316710] ._raw_spin_lock+0x140/0x17c
[c00000000ffffc60] [c00000000058d2f0] ._spin_lock_irqsave+0x60/0x84
[c00000000ffffd00] [c0000000002f44d0] .blk_end_io+0x88/0xd8
[c00000000ffffda0] [c000000000407888] .__scsi_queue_insert+0xd8/0x128
[c00000000ffffe50] [c0000000002fa248] .blk_done_softirq+0xb0/0xe0
[c00000000ffffee0] [c00000000006f824] .__do_softirq+0x120/0x298
[c00000000fffff90] [c00000000002ba94] .call_do_softirq+0x14/0x24
[c000000044453330] [c00000000000d674] .do_softirq+0x94/0x114
[c0000000444533d0] [c00000000006fab0] .irq_exit+0x70/0x88
[c000000044453450] [c00000000000db2c] .do_IRQ+0x1c8/0x210
[c000000044453500] [c000000000004814] hardware_interrupt_entry+0x1c/0x20
--- Exception: 501 at .raw_local_irq_restore+0x3c/0x40
    LR = .kmem_cache_alloc+0xf4/0x14c
[c0000000444537f0] [c000000000112038] .kmem_cache_alloc+0xe8/0x14c (unreliable)
[c0000000444538a0] [c00000000001154c] .alloc_thread_info+0x28/0x60
[c000000044453920] [c0000000000663b0] .copy_process+0xe4/0x1168
[c000000044453a10] [c000000000067804] .do_fork+0x194/0x438
[c000000044453b30] [c000000000011a18] .sys_clone+0x5c/0x74
[c000000044453ba0] [c000000000008788] .ppc_clone+0x8/0xc
--- Exception: c00 at .kernel_thread+0x28/0x70
    LR = .wait_for_helper+0x38/0xb0
[c000000044453e90] [0000000000000078] 0x78 (unreliable)
[c000000044453f00] [c00000000007f200] .wait_for_helper+0x24/0xb0
[c000000044453f90] [c00000000002bd9c] .kernel_thread+0x54/0x70
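
Reading the two reports together: khelper/1202 took &q->__queue_lock in
__scsi_queue_insert (+0xc4) and, still inside __scsi_queue_insert (+0xd8),
called blk_end_io, which tried to take the same lock through
_spin_lock_irqsave. The spinlock lockup that follows is on the same path.
The pattern can be sketched in user space as below. This is only an analogy
and not the kernel code: queue_lock, end_io(), requeue_holding_lock() and
recursive_lock_demo() are hypothetical names, and a POSIX error-checking
mutex stands in for the spinlock so that the second acquisition is reported
as EDEADLK instead of spinning forever.

```c
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for q->__queue_lock (user space, not kernel code). */
static pthread_mutex_t queue_lock;

/* Stand-in for the completion path: it takes the queue lock itself. */
static int end_io(void)
{
    int rc = pthread_mutex_lock(&queue_lock);
    if (rc != 0)
        return rc;  /* EDEADLK if this thread already holds the lock */
    /* ... the request would be completed here ... */
    pthread_mutex_unlock(&queue_lock);
    return 0;
}

/* Stand-in for the requeue path: calls end_io() with the lock still held. */
static int requeue_holding_lock(void)
{
    int rc;

    pthread_mutex_lock(&queue_lock);
    rc = end_io();  /* second acquisition by the same thread */
    pthread_mutex_unlock(&queue_lock);
    return rc;
}

/* Sets up an error-checking mutex, runs the bad path, returns its result. */
int recursive_lock_demo(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&queue_lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);

    rc = requeue_holding_lock();

    pthread_mutex_destroy(&queue_lock);
    return rc;
}
```

Calling recursive_lock_demo() from a main() and building with -pthread should
give EDEADLK as the return value. With a plain non-recursive lock the second
acquisition would simply never return, which is what the "spinlock lockup"
report above shows.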

I could boot next-20090324 on the same machine.

Attached are the .config and the complete dmesg.

Thanks
-Sachin


-- 

---------------------------------
Sachin Sant
IBM Linux Technology Center
India Systems and Technology Labs
Bangalore, India
---------------------------------

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dmesg_log
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20090325/d0626bdb/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: config_js22_next25
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20090325/d0626bdb/attachment.asc>

