<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>I vaguely remember this topic being discussed in the past.
It looks like we need to do two things prior to calling SRESET. I will
comment on the review.
</p>
<p>!! Vishwa !!</p>
<div class="moz-cite-prefix">On 5/27/19 12:45 PM, Jayanth Othayoth
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CACkAXSpuVmMWXwzPxBYrw4ZUpKHAgw_KtKboR6iVGyEuyWpVcg@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div>The design template review is available here:<br>
</div>
<div><br>
</div>
<div><a
href="https://gerrit.openbmc-project.xyz/c/openbmc/docs/+/21772"
target="_blank" moz-do-not-send="true">https://gerrit.openbmc-project.xyz/c/openbmc/docs/+/21772</a></div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Thu, May 16, 2019 at 6:31
PM Andrew Geissler &lt;<a href="mailto:geissonator@gmail.com"
target="_blank" moz-do-not-send="true">geissonator@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On
Thu, May 16, 2019 at 1:36 AM Deepak Kodihalli<br>
&lt;<a href="mailto:dkodihal@linux.vnet.ibm.com"
target="_blank" moz-do-not-send="true">dkodihal@linux.vnet.ibm.com</a>&gt;
wrote:<br>
><br>
> On 15/05/19 6:09 PM, Jayanth Othayoth wrote:<br>
> > ## Problem Description<br>
> > Issue #457: Add support to debug unresponsive host.<br>
> ><br>
> > Scope: High-level design direction to solve this
problem.<br>
> ><br>
> > ## Background and References<br>
> > There are situations at customer sites where
OPAL/Linux becomes<br>
> > unresponsive, causing a system hang, and there is no
way to figure out<br>
> > what went wrong with the Linux kernel or OPAL. We are looking
for a way to trigger<br>
> > a dump capture on the Linux host so that we can capture
the OS dump for later<br>
> > analysis.<br>
> ><br>
> > ## Proposed Design for POWER processor based
systems:<br>
> > Get all host CPUs into the reset vector; Linux then has
a mechanism to<br>
> > patch this into the panic-kdump path to trigger dump
capture. This will enable<br>
> > us to analyze and fix customer issues where we see
a Linux hang and<br>
> > an unresponsive system.<br>
> ><br>
> > ### Redfish Schema used:<br>
> > * Reference: DSP2046 2018.3<br>
> > * The ComputerSystem 1.6.0 schema provides an action
called<br>
> > “#ComputerSystem.Reset”. This action is used to reset
the system.<br>
> > The ResetType parameter indicates the type of
reset to be<br>
> > performed. In this use case we can use the “Nmi” type:<br>
> > * Nmi: Generate a Diagnostic Interrupt (usually
an NMI on x86<br>
> > systems) to cease normal operations, perform
diagnostic actions and<br>
> > typically halt the system.<br>
> > ### D-Bus:<br>
> ><br>
> > Option 1: Extend the existing D-Bus
interface in the State.Host<br>
> > namespace (<br>
> >
/openbmc/phosphor-dbus-interfaces/xyz/openbmc_project/State/Host.interface.yaml<br>
> > ) to support a new RequestedHostTransition value
called “Nmi”. The D-Bus<br>
> > backend can internally invoke a processor-specific
target to do the SRESET<br>
> > (equivalent to an x86 NMI) and associated actions.<br>
><br>
> I would rather avoid this option, because it would mean
adding host-specific<br>
> code to phosphor-state-manager, which I think has been
host-agnostic until now.<br>
<br>
Yeah, this was my main concern with tying it into
phosphor-state-manager.<br>
The fact Redfish put it in with their other state related
commands (which<br>
are implemented by phosphor-state-manager) is the only reason
I'm a little<br>
wishy-washy here. We could just create a generic systemd
target "host-nmi"<br>
or something and phosphor-state-manager could just call that
to abstract<br>
any of the specifics, but it still doesn't really feel like it
fits to me.<br>
<br>
I think I prefer option 2, and then we can just map bmcweb to
that API when<br>
the Redfish command comes in. Sounds like for ppc64 systems we
can just<br>
use pdbg to issue the NMI.<br>
<br>
> So for that reason, Option 2 sounds better. There are
some good<br>
> questions from Neeraj as well, so I would suggest adding
this as a<br>
> design template on Gerrit to gather better feedback.<br>
><br>
> Thanks,<br>
> Deepak<br>
><br>
> > Option 2: Introduce a new D-Bus interface in the
Control.Host namespace<br>
> > (<br>
> >
/openbmc/phosphor-dbus-interfaces/xyz/openbmc_project/Control/Host/NMI.interface.yaml)<br>
> > and implement the new D-Bus back-end for
the respective processor-specific<br>
> > targets.<br>
><br>
</blockquote>
</div>
</blockquote>
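<p>For reference, the Redfish side of the proposal quoted above (the
ComputerSystem.Reset action with ResetType “Nmi”) could look roughly like
the sketch below. It only builds the request and does not send it. The
host name <code>bmc.example.com</code>, the helper name
<code>build_nmi_request</code>, and the system resource name
<code>system</code> in the URI are assumptions made for illustration,
and authentication is left out; none of these come from the design
template.</p>

```python
import json
import urllib.request

# Assumption: the ComputerSystem resource is exposed as "system".
RESET_ACTION_PATH = "/redfish/v1/Systems/system/Actions/ComputerSystem.Reset"


def build_nmi_request(bmc_host: str) -> urllib.request.Request:
    """Build (but do not send) a Redfish reset request with ResetType "Nmi"."""
    # The action takes a single parameter, ResetType, as described in the thread.
    body = json.dumps({"ResetType": "Nmi"}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{bmc_host}{RESET_ACTION_PATH}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "bmc.example.com" is a placeholder host; authentication is omitted.
    req = build_nmi_request("bmc.example.com")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

<p>With option 2, bmcweb would translate a request like this into a call
on the new D-Bus interface, and the processor-specific back-end would
decide how the NMI is actually delivered.</p>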
</body>
</html>