PPC440 Kernel Stack overflow

Steve Boorman steveb at baydel.com
Fri Jul 16 20:49:47 EST 2004


We recently traced a system hang to a bug in one of our drivers.
The bug caused the driver to call itself repeatedly, which overflowed
the kernel stack. The surprising thing is that the machine
would just hang: no output on the console, and all interrupts, including
the timer, were dead. We never got the message "Kernel stack overflow
in process", which is what I expected.

We are running a ported version of 2.4.26 on our hardware (PPC440GP
based). Suspecting that something might be adrift with the port, I tried
this with the stock 2.4.26 IBM Ebony kernel running on the Ebony eval
board. This was done using a test driver, written as a loadable
module. The driver simulated a kernel stack overflow by repeated
calls to a function within the same module. The result was identical,
i.e. no messages on the console and the system freezes completely.
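
To illustrate the idea, the recursion and the kind of low-stack check I
was expecting can be modelled in user space roughly as below. This is
only a sketch and not the actual test module or the kernel's own check:
the sim_* names and all three sizes are made up for illustration (the
8 KB figure is an assumption about the size of the kernel stack region).

```c
#include <stddef.h>

/* All values below are assumptions chosen for illustration only. */
#define SIM_STACK_SIZE 8192u  /* assumed size of the stack region, in bytes */
#define SIM_FRAME_SIZE 256u   /* assumed stack cost of one recursive call */
#define SIM_GUARD_SIZE 1024u  /* assumed margin below which we report */

/*
 * Hypothetical check: returns nonzero when taking one more frame would
 * leave less than the guard margin free.
 */
static int sim_stack_low(size_t used)
{
    return used + SIM_FRAME_SIZE > SIM_STACK_SIZE - SIM_GUARD_SIZE;
}

/*
 * Models the test driver: a function that keeps calling itself, each
 * call consuming SIM_FRAME_SIZE bytes. Returns the depth at which the
 * check fires instead of running off the end of the stack.
 */
static unsigned sim_recurse(size_t used, unsigned depth)
{
    if (sim_stack_low(used))
        return depth;
    return sim_recurse(used + SIM_FRAME_SIZE, depth + 1);
}
```

With these assumed sizes, sim_recurse(0, 0) stops at depth 28, i.e. once
7168 bytes are in use. The point of the model is only that a check of
this sort has to run before the stack is exhausted to be able to report
anything.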

Am I expecting too much here, or is something wrong in the kernel
stack overflow detection?

The problem is that this type of hang is very hard to debug. We have
implemented the PPC440 watch-dog in our kernel port, and whilst that
happily traps code spinning in a loop, it does not trap this kernel
stack problem, presumably because even critical exception interrupts
are not being processed. The watch-dog is definitely expiring.

We do not have (at the moment) a BDI2000 and wondered if it would be
any good at tracking this type of crash down anyway?

Any thoughts on this would be appreciated.


Steve Boorman

** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/
