[RFC] BMC RAS Feature

Supreeth Venkatesh supreeth.venkatesh at amd.com
Thu Mar 23 11:07:24 AEDT 2023


On 3/22/23 02:10, Lei Yu wrote:
>>> On Tue, 21 Mar 2023 at 20:38, Supreeth Venkatesh
>>> <supreeth.venkatesh at amd.com> wrote:
>>>
>>>
>>>      On 3/21/23 05:40, Patrick Williams wrote:
>>>      > On Tue, Mar 21, 2023 at 12:14:45AM -0500, Supreeth Venkatesh wrote:
>>>      >
>>>      >> #### Alternatives Considered
>>>      >>
>>>      >> In-band mechanisms using System Management Mode (SMM) exist.
>>>      >>
>>>      >> However, the out-of-band method to gather RAS data is
>>>      >> processor specific.
>>>      >>
>>>      > How does this compare with existing implementations in
>>>      > phosphor-debug-collector?
>>>      Thanks for your feedback. See below.
>>>      > I believe there was some attempt to extend
>>>      > P-D-C previously to handle Intel's crashdump behavior.
>>>      Intel's crashdump interface uses com.intel.crashdump.
>>>      We have implemented com.amd.crashdump based on that reference.
>>>      However,
>>>      can this be made generic?
>>>
>>>      PoC below:
>>>
>>>      busctl tree com.amd.crashdump
>>>
>>>      └─/com
>>>         └─/com/amd
>>>           └─/com/amd/crashdump
>>>             ├─/com/amd/crashdump/0
>>>             ├─/com/amd/crashdump/1
>>>             ├─/com/amd/crashdump/2
>>>             ├─/com/amd/crashdump/3
>>>             ├─/com/amd/crashdump/4
>>>             ├─/com/amd/crashdump/5
>>>             ├─/com/amd/crashdump/6
>>>             ├─/com/amd/crashdump/7
>>>             ├─/com/amd/crashdump/8
>>>             └─/com/amd/crashdump/9
>>>
>>>      > The repository
>>>      > currently handles IBM's processors, I think, or maybe that is
>>>      > covered by openpower-debug-collector.
>>>      >
>>>      > In any case, I think you should look at the existing D-Bus
>>>      > interfaces (and associated Redfish implementation) of these
>>>      > repositories and determine if you can use those approaches (or
>>>      > document why not).
>>>      I could not find an existing D-Bus interface for RAS in
>>>      xyz/openbmc_project/.
>>>      It would be helpful if you could point me to it.
>>>
>>>
>>> There is an interface for the dumps generated from the host, which can
>>> be used for these kinds of dumps
>>> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
>>>
>>> The fault log also provides similar dumps
>>> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
>>>
>> Thanks, Dhruvraj. The interface looks useful for the purpose. However,
>> the current bmcweb implementation references
>> https://github.com/openbmc/bmcweb/blob/master/redfish-core/lib/log_services.hpp
>> [com.intel.crashdump]
>> constexpr char const* crashdumpPath = "/com/intel/crashdump";
>>
>> constexpr char const* crashdumpInterface = "com.intel.crashdump";
>> constexpr char const* crashdumpObject = "com.intel.crashdump";
>>
>> Is either
>> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
>> or
>> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
>> exercised in the Redfish log services?
> In our practice, a plugin `tools/dreport.d/plugins.d/acddump` is added
> to copy the crashdump json file to the dump tarball.
> The crashdump tool (Intel or AMD) could trigger a dump after the
> crashdump is completed, and then we could get a dump entry containing
> the crashdump.
Thanks, Lei Yu, for your input. We are using Redfish to retrieve the CPER
binary file, which can then be passed through a plugin/script for
detailed analysis.
In any case, irrespective of which D-Bus interface we use, we need a
repository that gathers data from the AMD processor via APML, as per the
AMD design.
APML Spec: https://www.amd.com/system/files/TechDocs/57019-A0-PUB_3.00.zip
Can someone please help create a bmc-ras or amd-debug-collector
repository, just as the openpower-debug-collector repository exists for
OpenPOWER systems?
>
>
> --
> BRs,
> Lei YU

