[RFC] BMC RAS Feature
Lei Yu
yulei.sh at bytedance.com
Wed Mar 22 18:10:31 AEDT 2023
> > On Tue, 21 Mar 2023 at 20:38, Supreeth Venkatesh
> > <supreeth.venkatesh at amd.com> wrote:
> >
> >
> > On 3/21/23 05:40, Patrick Williams wrote:
> > > On Tue, Mar 21, 2023 at 12:14:45AM -0500, Supreeth Venkatesh wrote:
> > >
> > >> #### Alternatives Considered
> > >>
> > >> In-band mechanisms using System Management Mode (SMM) exist.
> > >>
> > >> However, the out-of-band method to gather RAS data is
> > >> processor-specific.
> > >>
> > > How does this compare with existing implementations in
> > > phosphor-debug-collector.
> > Thanks for your feedback. See below.
> > > I believe there was some attempt to extend
> > > P-D-C previously to handle Intel's crashdump behavior.
> > Intel's crashdump interface uses com.intel.crashdump.
> > We have implemented com.amd.crashdump based on that reference.
> > However, can this be made generic?
> >
> > PoC below:
> >
> > busctl tree com.amd.crashdump
> >
> > └─/com
> >   └─/com/amd
> >     └─/com/amd/crashdump
> >       ├─/com/amd/crashdump/0
> >       ├─/com/amd/crashdump/1
> >       ├─/com/amd/crashdump/2
> >       ├─/com/amd/crashdump/3
> >       ├─/com/amd/crashdump/4
> >       ├─/com/amd/crashdump/5
> >       ├─/com/amd/crashdump/6
> >       ├─/com/amd/crashdump/7
> >       ├─/com/amd/crashdump/8
> >       └─/com/amd/crashdump/9
> >
> > > The repository currently handles IBM's processors, I think, or
> > > maybe that is covered by openpower-debug-collector.
> > >
> > > In any case, I think you should look at the existing D-Bus
> > > interfaces (and associated Redfish implementation) of these
> > > repositories and determine if you can use those approaches (or
> > > document why not).
> > I could not find an existing D-Bus interface for RAS in
> > xyz/openbmc_project/.
> > It would be helpful if you could point me to it.
> >
> >
> > There is an interface for the dumps generated from the host, which can
> > be used for these kinds of dumps
> > https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
> >
> > The fault log also provides similar dumps
> > https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
> >
> Thanks Dhruvraj. The interface looks useful for the purpose. However,
> the current bmcweb implementation references
> https://github.com/openbmc/bmcweb/blob/master/redfish-core/lib/log_services.hpp
> [com.intel.crashdump]
> constexpr char const* crashdumpPath = "/com/intel/crashdump";
>
> constexpr char const* crashdumpInterface = "com.intel.crashdump";
> constexpr char const* crashdumpObject = "com.intel.crashdump";
>
> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
> or
> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
> Is either of these exercised in the Redfish log services?
In our practice, we add a plugin, `tools/dreport.d/plugins.d/acddump`,
that copies the crashdump JSON file into the dump tarball.
The crashdump tool (Intel or AMD) can then trigger a dump once the
crashdump is complete, which gives us a dump entry containing the
crashdump.
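
For illustration, a minimal sketch of what such a plugin could look like.
This is not our actual acddump plugin. The CRASHDUMP_DIR path and the
DUMP_STAGING_DIR fallback are hypothetical names used only here, and
add_copy_file is assumed to be the helper provided by dreport's
`functions` include (file name, then description); please check it
against your tree. The header comments that real plugins carry are
omitted.

```shell
#!/bin/sh
# Sketch of a dreport plugin that copies crashdump JSON files into the dump.

# Hypothetical location of the crashdump output; adjust for the platform.
CRASHDUMP_DIR="${CRASHDUMP_DIR:-/tmp/crashdump/output}"

# When run by dreport, pull in its helper functions.
if [ -n "${DREPORT_INCLUDE:-}" ] && [ -f "$DREPORT_INCLUDE/functions" ]; then
    . "$DREPORT_INCLUDE/functions"
fi

# Fallback for running the sketch outside dreport: copy the file into a
# hypothetical staging directory instead of the real dump location.
if ! command -v add_copy_file >/dev/null 2>&1; then
    add_copy_file() {
        cp "$1" "${DUMP_STAGING_DIR:-.}/"
    }
fi

# Copy every *.json file found in CRASHDUMP_DIR and print how many
# files were added.
collect_crashdumps() {
    count=0
    for f in "$CRASHDUMP_DIR"/*.json; do
        # Skip the unexpanded glob when no JSON file exists.
        [ -f "$f" ] || continue
        add_copy_file "$f" "crashdump $(basename "$f")"
        count=$((count + 1))
    done
    echo "$count"
}

collect_crashdumps >/dev/null
```

With something like this in place, the crashdump tool only needs to
request a dump after it has written its JSON file, and the file ends up
in the resulting dump entry.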
--
BRs,
Lei YU