Hardware Watchdog Device in pSeries?

Mike Strosaker strosake at austin.ibm.com
Thu Oct 14 05:57:35 EST 2004

Linas Vepstas wrote:
> I might have volunteered to hack this up real quick, were it not for
> Mike Strosaker's correction, that the surveillance features were taken
> out of Power5.
> Anyone on this list know why?

I sent the reason I got from the hardware RAS folks to this list a while back.
Luckily, it's still in my sent mail folder:

"Because of the virtualization layer and partitioning, the surveillance
requirement was moved to PHYP<->SP.  Apparently, this was a hotly
contested issue among the platform design folks (especially considering that
partitioned Power4 systems still have OS<->SP surveillance).  I think the logic
is: If an OS goes down, it's not likely a server problem, hence no requirement
to monitor from the server side.

At least the platform gets notified of panics via os-term.  I gather
that some user space tools are expected to monitor for deadlocks/hangs
(maybe clustering tools)."


