[PATCH 1/8] pseries: phyp dump: Documentation

Sat Jan 12 03:57:51 EST 2008

On 10/01/2008, Nathan Lynch <ntl at pobox.com> wrote:
> Mike Strosaker wrote:
> >
> > At the risk of repeating what others have already said, the PHYP-assistance
> > method provides some advantages that the kexec method cannot:
> >  - Availability of the system for production use before the dump data is
> > collected.  As was mentioned before, some production systems may choose not
> > to operate with the limited memory initially available after the reboot,
> > but it sure is nice to provide the option.
>
> I'm more concerned that this design encourages the user to resume a
> workload *which is almost certainly known to result in a system crash*
> before collection of crash data is complete.  Maybe the gamble will
> pay off most of the time, but I wouldn't want to be working support
> when it doesn't.

Workloads that cause crashes within hours of startup tend to be
weeded out during pre-production testing of the system to be
deployed.  Since it's pre-production testing, dumps can be
taken in a leisurely manner.  Heck, even a session at the
xmon prompt can be contemplated.

The problem is when the crash only reproduces after days or
weeks of uptime, on a production machine.  Since the machine
is in production, it's got to be brought back up ASAP.  Since
it's crashing only after days or weeks, the dump should have
plenty of time to complete.  (And if it crashes quickly after
that reboot ... well, support people always welcome ways
in which a bug can be reproduced more quickly and easily.)

--linas