funny kernel death with ksoftirqd_CPUX taking up almost 100% of cpu?
Benjamin Herrenschmidt
benh at kernel.crashing.org
Thu Jul 11 02:11:26 EST 2002
>I just experienced an alarming form of kernel death running a self-compiled
>SMP kernel with HIGHMEM enabled on my dual G4 machine with 1 GB of RAM.
>
>The kernel tree used is Ben's 2.4.19-pre10 one rebuilt for SMP support, aec
>IDE driver and otherwise basically stock.
>
>I was debugging a large program in gdb and noticed typing got slower and
>slower. A quick check of top showed that ksoftirqd_CPU was taking up
>almost 100% of the cpu. I exited gdb and killed every process I
>could think of, but the usage of that kernel daemon stayed at near 100%.
>
>It became so bad I could barely perform a straight shutdown (I had to hit
>return numerous times to allow the other cpu to get some time to handle
>the shutdown).
>
>There were lots of messages like the following as I tried to shut down:
>
> ../..
>
>Jul 10 11:57:36 localhost kernel: wait_on_irq, CPU 0:
>Jul 10 11:57:36 localhost kernel: irq: -1 [0 0]
>Jul 10 11:57:36 localhost kernel: bh: 0 [0 0]
>Jul 10 11:57:37 localhost kernel:
Hrm... looks bad. global_irq_count went negative!
So either somebody is doing a mismatched hardirq_enter/leave
pair, which I seriously doubt, or our atomics are broken on
those machines (ugh!!!)
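To illustrate the first possibility, here is a minimal user-space sketch
using C11 atomics. It is not the actual 2.4 kernel code: irq_count,
sketch_irq_enter() and sketch_irq_leave() are made-up stand-ins for
global_irq_count and the hardirq_enter/leave pair. It only shows how one
unmatched leave would drive such a counter to -1, as in the log above.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's global_irq_count. */
static atomic_int irq_count;

/* Hypothetical analogues of hardirq_enter()/hardirq_leave(). */
static void sketch_irq_enter(void) { atomic_fetch_add(&irq_count, 1); }
static void sketch_irq_leave(void) { atomic_fetch_sub(&irq_count, 1); }

/* Balanced path: every enter has exactly one leave, so the count
 * returns to 0. */
static int balanced_path(void)
{
    atomic_store(&irq_count, 0);
    sketch_irq_enter();
    sketch_irq_leave();
    return atomic_load(&irq_count);
}

/* Mismatched path: a second leave with no matching enter pushes the
 * count below zero. */
static int mismatched_path(void)
{
    atomic_store(&irq_count, 0);
    sketch_irq_enter();
    sketch_irq_leave();
    sketch_irq_leave();   /* the unmatched leave */
    return atomic_load(&irq_count);
}
```

If the atomic add/sub themselves were broken (the second possibility), a
balanced sequence running on two CPUs could lose an update and end up with
the same kind of wrong count, which this single-threaded sketch does not
try to reproduce.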
Paul, any good ideas at hand?
Ben.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/