Hardware Watchdog Device in pSeries?

Linas Vepstas linas at austin.ibm.com
Thu Oct 14 05:23:56 EST 2004


Hi, 

I'm copying this over to the linuxppc64-dev at ozlabs.org
mailing list, which is the right place to discuss it.

On Wed, Oct 13, 2004 at 12:01:45PM -0600, Alan Robertson was heard to remark:
> Linas Vepstas wrote:
> >Hi,
> >
> >On Wed, Oct 13, 2004 at 09:12:23AM +0800, Zhen Huang was heard to remark:
> >
> >>Hi,
> >>
> >>By watchdog I mean a device like this:
> >>once we open it, we must write to it regularly.
> >>Otherwise the whole system will be reset.
> >>
> >>Many OSes have a software implementation of this.
> >>But a software watchdog depends on the health of the OS.
> >>
> >>I want to know whether there is any hardware implementation in pServer.
> >
> >
> >Yes, there is a hardware watchdog; it's implemented on all pSeries
> >machines that have service processors (thus, it goes back to at
> >least Power3).  However, it is not a unix 'device' that a user-land
> >process can 'open'; it is only accessible through RTAS calls.  The
> >kernel daemon rtasd provides the regular heartbeat.
> >
> >The kernel enables the watchdog function with the 'enable_surveillance()'
> >subroutine call (see arch/ppc64/kernel/rtasd.c).
> >Once it's enabled, the heartbeat is the 'event-scan' RTAS call,
> >which the kernel must call regularly from each CPU.  (I guess this
> >helps detect hung CPUs on SMP systems.)  If the event-scan call
> >isn't made within the 'surveillance timeout', the SP will reboot
> >the OS (or call in a service request, etc.).
> >
> >I don't know if there is any interest in moving this heartbeat 
> >watchdog out from kernel space into user space; right now, 
> >rtasd is a kernel daemon, and it more or less just works.
> >
> >If it is ever converted to userland, it's not likely it will
> >ever be a traditional unix device; instead, functions like
> >this are moving to the sysfs file system.
> 
> This would be a logical equivalent to the well-known and long-standing 
> 'softdog' device driver which already has a well-known API, which is also 
> implemented on other hardware devices and architectures.
> 
> So, my suggestion would be that, if it were moved to a userspace driver,
> the softdog API be retained.
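
For anyone not familiar with it, the softdog-style interface referred to
above is the usual /dev/watchdog character device: open it, write to it
regularly, and write 'V' before closing to disarm it.  Below is a minimal
sketch of a userland client, assuming that interface.  The wd_* helper
names and the loop parameters are made up for illustration; this is not
an existing pSeries driver, just what a client of such a device would
roughly look like.

```c
/* Sketch of a user-space client for a softdog-style watchdog device.
 * All wd_* names are hypothetical helpers, not part of any kernel API. */
#include <fcntl.h>
#include <unistd.h>

/* Open the watchdog device. On a real watchdog the countdown starts here. */
int wd_open(const char *path)
{
    return open(path, O_WRONLY);
}

/* Send one heartbeat: with softdog, any write restarts the countdown.
 * Returns 0 on success, -1 on failure. */
int wd_pet(int fd)
{
    return write(fd, "\0", 1) == 1 ? 0 : -1;
}

/* Orderly shutdown: write the "magic close" character 'V' so the driver
 * disarms on close (unless it was loaded with nowayout), then close. */
int wd_close(int fd)
{
    int rc = write(fd, "V", 1) == 1 ? 0 : -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* Open the device at `path`, send `beats` heartbeats with `interval_s`
 * seconds between them, then disarm and close. Returns 0 on success. */
int wd_heartbeat_loop(const char *path, unsigned beats, unsigned interval_s)
{
    int fd = wd_open(path);
    if (fd < 0)
        return -1;

    for (unsigned i = 0; i < beats; i++) {
        if (wd_pet(fd) != 0) {
            close(fd);   /* no magic close: a real watchdog stays armed */
            return -1;
        }
        sleep(interval_s);
    }
    return wd_close(fd);
}
```

A monitoring daemon would call something like
wd_heartbeat_loop("/dev/watchdog", n, interval) with the interval chosen
well below the driver's timeout.  If the daemon (or the whole OS) hangs,
the writes stop and the machine gets reset.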

I might have volunteered to hack this up real quick, were it not for
Mike Strosaker's correction that the surveillance features were taken
out of Power5.

Anyone on this list know why?

--linas


