[RFC] BMC RAS Feature

Lei Yu yulei.sh at bytedance.com
Wed Mar 22 18:10:31 AEDT 2023


> > On Tue, 21 Mar 2023 at 20:38, Supreeth Venkatesh
> > <supreeth.venkatesh at amd.com> wrote:
> >
> >
> >     On 3/21/23 05:40, Patrick Williams wrote:
> >     > On Tue, Mar 21, 2023 at 12:14:45AM -0500, Supreeth Venkatesh wrote:
> >     >
> >     >> #### Alternatives Considered
> >     >>
> >     >> In-band mechanisms using System Management Mode (SMM) exist.
> >     >>
> >     >> However, the out-of-band method for gathering RAS data is
> >     >> processor-specific.
> >     >>
> >     > How does this compare with existing implementations in
> >     > phosphor-debug-collector?
> >     Thanks for your feedback. See below.
> >     > I believe there was some attempt to extend
> >     > P-D-C previously to handle Intel's crashdump behavior.
> >     Intel's crashdump interface uses com.intel.crashdump.
> >     We have implemented com.amd.crashdump based on that reference.
> >     However, can this be made generic?
> >
> >     PoC below:
> >
> >     busctl tree com.amd.crashdump
> >
> >     └─/com
> >        └─/com/amd
> >          └─/com/amd/crashdump
> >            ├─/com/amd/crashdump/0
> >            ├─/com/amd/crashdump/1
> >            ├─/com/amd/crashdump/2
> >            ├─/com/amd/crashdump/3
> >            ├─/com/amd/crashdump/4
> >            ├─/com/amd/crashdump/5
> >            ├─/com/amd/crashdump/6
> >            ├─/com/amd/crashdump/7
> >            ├─/com/amd/crashdump/8
> >            └─/com/amd/crashdump/9
> >
> >     > The repository currently handles IBM's processors, I think, or
> >     > maybe that is covered by openpower-debug-collector.
> >     >
> >     > In any case, I think you should look at the existing D-Bus
> >     > interfaces (and associated Redfish implementation) of these
> >     > repositories and determine if you can use those approaches (or
> >     > document why not).
> >     I could not find an existing D-Bus interface for RAS in
> >     xyz/openbmc_project/.
> >     It would be helpful if you could point me to it.
> >
> >
> > There is an interface for the dumps generated from the host, which can
> > be used for these kinds of dumps:
> > https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
> >
> > The fault log also provides similar dumps:
> > https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
> >
> Thanks, Dhruvraj. The interface looks useful for the purpose. However,
> the current bmcweb implementation references
> https://github.com/openbmc/bmcweb/blob/master/redfish-core/lib/log_services.hpp
> [com.intel.crashdump]
> constexpr char const* crashdumpPath = "/com/intel/crashdump";
>
> constexpr char const* crashdumpInterface = "com.intel.crashdump";
> constexpr char const* crashdumpObject = "com.intel.crashdump";
>
> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
> or
> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
> Is either of them exercised in the Redfish LogServices implementation?

In our setup, we added a plugin, `tools/dreport.d/plugins.d/acddump`,
that copies the crashdump JSON file into the dump tarball.
The crashdump tool (Intel or AMD) can then trigger a dump once the
crashdump has completed, which gives us a dump entry containing the
crashdump.
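
To illustrate, here is a minimal sketch of what such a plugin could look
like. It is not our actual `acddump` script. The crashdump output
directory (`/var/lib/crashdump`), the `# config:` values and the
function name `collect_crashdumps` are assumptions made for the example.
`add_copy_file` is expected to come from dreport's `functions` helper; a
simple stand-in is defined when that helper is not available, so the
sketch can be tried outside of dreport.

```shell
#!/bin/bash
#
# config: 2 20
# @brief: Copy crashdump JSON files into the dump (illustrative sketch).
#
# NOTE: the config values above and the directory below are placeholders,
# not taken from a real deployment.

# Hypothetical location where the crashdump tool writes its JSON output.
CRASHDUMP_DIR="${CRASHDUMP_DIR:-/var/lib/crashdump}"

if [ -n "$DREPORT_INCLUDE" ] && [ -f "$DREPORT_INCLUDE/functions" ]; then
    # Running under dreport: use its helper functions.
    . "$DREPORT_INCLUDE/functions"
else
    # Stand-in for dreport's add_copy_file, for trying the sketch
    # standalone: copy the file ($1) into $name_dir and print the
    # description ($2).
    add_copy_file() {
        cp "$1" "${name_dir:-.}/" && echo "copied: $2"
    }
fi

# Copy every *.json file found in CRASHDUMP_DIR into the dump.
collect_crashdumps() {
    local f
    for f in "$CRASHDUMP_DIR"/*.json; do
        # Skip the unexpanded glob when no file matches.
        [ -f "$f" ] || continue
        add_copy_file "$f" "crashdump $(basename "$f")"
    done
}

collect_crashdumps
```

With something like this installed, the crashdump tool only needs to
request a dump after it has written its JSON file. How it makes that
request is outside the scope of this sketch.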


-- 
BRs,
Lei YU

