Hardware Watchdog Device in pSeries?
Linas Vepstas
linas at austin.ibm.com
Fri Oct 15 02:21:41 EST 2004
Hi Alan,
Long emails confuse me ...
On Wed, Oct 13, 2004 at 10:41:26PM -0600, Alan Robertson was heard to remark:
> Linas Vepstas wrote:
> >why should someone buy 12 pci-card watchdogs, one for each partition,
> >chewing up 12 pci slots, when the pSeries is already capable of doing
>
> It looks really Rube Goldberg-ish (to say the least).
[...]
>
> The hardware watchdog timer is a 3rd party
> monitoring system, and therefore is likely to be reliable when the thing it
> is watching is sick -
Not sure where you're going with this; are you saying that
3rd-party watchdog PCI cards, one for each partition, are a
good idea, or a bad idea?
Would you rather have the OS monitoring done with
(a) watchdog PCI cards,
(b) 'surveillance' done by firmware/hypervisor,
(c) or some other method?
> The bootstrap loader should work much the
I guess I didn't get this exposition either. Although it's nice to
know that boot was successful, I see boot as a whole lot less
important than monitoring the system once it's gone 'online'. The boot
sequence can be monitored much more loosely, with a whole lot less
complexity. The hypervisor knows when the OS boot sequence starts.
If the OS hasn't completely booted after, say, 10 minutes, then it
can call a human to look at the problem. I don't see why one needs
to heartbeat once a second during boot; that's hard to do and seems
unnecessary. By contrast, I'd expect to turn on the once-per-second
heartbeat just before the system goes 'online' or 'critical'.
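
To make the two-phase idea concrete, here is a rough sketch in Python.
It is not pSeries firmware or hypervisor code; the class name, the
method names, the returned strings and the three-missed-beats tolerance
are all assumptions made up for illustration. Only the 10-minute boot
deadline and the once-per-second heartbeat come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

BOOT_DEADLINE_S = 600.0     # "say, 10 minutes" for the whole boot sequence
HEARTBEAT_PERIOD_S = 1.0    # once-per-second heartbeat after going online
MISSED_BEATS_ALLOWED = 3    # assumption: the tolerance is not given in the text


@dataclass
class Surveillance:
    """Hypothetical monitor: loose boot deadline, then a tight heartbeat."""
    boot_started_at: float
    online: bool = False
    last_beat_at: Optional[float] = None

    def go_online(self, now: float) -> None:
        # The OS switches on heartbeating just before it goes 'online'.
        self.online = True
        self.last_beat_at = now

    def heartbeat(self, now: float) -> None:
        # Heartbeats sent during boot are ignored; none are required then.
        if self.online:
            self.last_beat_at = now

    def check(self, now: float) -> str:
        if not self.online:
            # Boot phase: a single coarse deadline, no per-second traffic.
            if now - self.boot_started_at > BOOT_DEADLINE_S:
                return "call-human"
            return "ok"
        # Online phase: complain once too many heartbeats have been missed.
        if now - self.last_beat_at > HEARTBEAT_PERIOD_S * MISSED_BEATS_ALLOWED:
            return "heartbeat-lost"
        return "ok"
```

The monitor would call check() periodically with its own clock reading;
what it does on "call-human" or "heartbeat-lost" (page someone, reset
the partition) is a policy decision left out of the sketch.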
--linas
More information about the Linuxppc64-dev
mailing list