[RFC] BMC RAS Feature

Supreeth Venkatesh supreeth.venkatesh at amd.com
Wed Mar 22 04:25:44 AEDT 2023


On 3/21/23 11:26, dhruvaraj S wrote:
>
>
> On Tue, 21 Mar 2023 at 20:38, Supreeth Venkatesh 
> <supreeth.venkatesh at amd.com> wrote:
>
>
>     On 3/21/23 05:40, Patrick Williams wrote:
>     > On Tue, Mar 21, 2023 at 12:14:45AM -0500, Supreeth Venkatesh wrote:
>     >
>     >> #### Alternatives Considered
>     >>
>     >> In-band mechanisms using System Management Mode (SMM) exist.
>     >>
>     >> However, the out-of-band method to gather RAS data is
>     >> processor specific.
>     >>
>     > How does this compare with existing implementations in
>     > phosphor-debug-collector?
>     Thanks for your feedback. See below.
>     > I believe there was some attempt to extend
>     > P-D-C previously to handle Intel's crashdump behavior.
>     Intel's crashdump interface uses com.intel.crashdump.
>     We have implemented com.amd.crashdump based on that reference.
>     However, can this be made generic?
>
>     PoC below:
>
>     busctl tree com.amd.crashdump
>
>     └─/com
>        └─/com/amd
>          └─/com/amd/crashdump
>            ├─/com/amd/crashdump/0
>            ├─/com/amd/crashdump/1
>            ├─/com/amd/crashdump/2
>            ├─/com/amd/crashdump/3
>            ├─/com/amd/crashdump/4
>            ├─/com/amd/crashdump/5
>            ├─/com/amd/crashdump/6
>            ├─/com/amd/crashdump/7
>            ├─/com/amd/crashdump/8
>            └─/com/amd/crashdump/9
>
>     > The repository
>     > currently handles IBM's processors, I think, or maybe that is
>     > covered by openpower-debug-collector.
>     >
>     > In any case, I think you should look at the existing D-Bus
>     > interfaces (and associated Redfish implementation) of these
>     > repositories and determine if you can use those approaches
>     > (or document why not).
>     I could not find an existing D-Bus interface for RAS in
>     xyz/openbmc_project/.
>     It would be helpful if you could point me to it.
>
>
> There is an interface for the dumps generated from the host, which can 
> be used for these kinds of dumps
> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
>
> The fault log also provides similar dumps
> https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
>
Thanks, Dhruvaraj. The interface looks useful for this purpose. However,
the current bmcweb implementation references com.intel.crashdump in
https://github.com/openbmc/bmcweb/blob/master/redfish-core/lib/log_services.hpp:

constexpr char const* crashdumpPath = "/com/intel/crashdump";

constexpr char const* crashdumpInterface = "com.intel.crashdump";
constexpr char const* crashdumpObject = "com.intel.crashdump";

Is either
https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/System.interface.yaml
or
https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/yaml/xyz/openbmc_project/Dump/Entry/FaultLog.interface.yaml
exercised in Redfish LogServices?
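
For illustration only, the three crashdump constants quoted above from
log_services.hpp could be looked up per vendor instead of being fixed to
one. The sketch below is not bmcweb code: CrashdumpNames, CRASHDUMP_NAMES
and entry_path are hypothetical names, and the AMD values are assumed from
the com.amd.crashdump PoC (that its interface and service names match the
object root is an assumption).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrashdumpNames:
    """D-Bus names a Redfish Crashdump log service needs for one vendor."""
    path: str       # root object path
    interface: str  # interface name
    service: str    # bus (service) name


# Hypothetical registry. The Intel values mirror the constants quoted from
# log_services.hpp; the AMD values are assumed from the com.amd.crashdump PoC.
CRASHDUMP_NAMES = {
    "intel": CrashdumpNames(
        "/com/intel/crashdump", "com.intel.crashdump", "com.intel.crashdump"
    ),
    "amd": CrashdumpNames(
        "/com/amd/crashdump", "com.amd.crashdump", "com.amd.crashdump"
    ),
}


def entry_path(vendor: str, index: int) -> str:
    """Return the object path of crashdump entry `index` for `vendor`."""
    if index < 0:
        raise ValueError("index must be non-negative")
    return f"{CRASHDUMP_NAMES[vendor].path}/{index}"


if __name__ == "__main__":
    print(entry_path("amd", 0))
```

A generic xyz.openbmc_project interface would make such a table
unnecessary; the sketch only shows how small the vendor-specific part is.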

> The tree for the dump manager looks like this
> `-/xyz
>   `-/xyz/openbmc_project
>     `-/xyz/openbmc_project/dump
>       |-/xyz/openbmc_project/dump/bmc
>       | `-/xyz/openbmc_project/dump/bmc/entry
>       |   |-/xyz/openbmc_project/dump/bmc/entry/1
>       |   |-/xyz/openbmc_project/dump/bmc/entry/2
>       |   |-/xyz/openbmc_project/dump/bmc/entry/3
>       |   `-/xyz/openbmc_project/dump/bmc/entry/4
>       |-/xyz/openbmc_project/dump/faultlog
>       |-/xyz/openbmc_project/dump/hardware
>       |-/xyz/openbmc_project/dump/hostboot
>       |-/xyz/openbmc_project/dump/internal
>       | `-/xyz/openbmc_project/dump/internal/manager
>       |-/xyz/openbmc_project/dump/resource
>       |-/xyz/openbmc_project/dump/sbe
>       `-/xyz/openbmc_project/dump/system
>
>     There are references to com.intel.crashdump in the bmcweb code, but
>     the interface itself does not exist in yaml/com/intel/.
>     We can add com.amd.crashdump as a start, or even come up with a new
>     generic D-Bus interface.
>     As far as the Redfish implementation is concerned, we are following
>     the specification: the
>     redfish/v1/Systems/system/LogServices/Crashdump schema is being used.
>
>     {
>       "@odata.id":
>         "/redfish/v1/Systems/system/LogServices/Crashdump/Entries",
>       "@odata.type": "#LogEntryCollection.LogEntryCollection",
>       "Description": "Collection of Crashdump Entries",
>       "Members": [
>         {
>           "@odata.id":
>             "/redfish/v1/Systems/system/LogServices/Crashdump/Entries/0",
>           "@odata.type": "#LogEntry.v1_7_0.LogEntry",
>           "AdditionalDataURI":
>             "/redfish/v1/Systems/system/LogServices/Crashdump/Entries/0/ras-error0.cper",
>           "Created": "1970-1-1T0:4:12Z",
>           "DiagnosticDataType": "OEM",
>           "EntryType": "Oem",
>           "Id": "0",
>           "Name": "CPU Crashdump",
>           "OEMDiagnosticDataType": "APMLCrashdump"
>         },
>         {
>           "@odata.id":
>             "/redfish/v1/Systems/system/LogServices/Crashdump/Entries/1",
>           "@odata.type": "#LogEntry.v1_7_0.LogEntry",
>           "AdditionalDataURI":
>             "/redfish/v1/Systems/system/LogServices/Crashdump/Entries/1/ras-error1.cper",
>           "Created": "1970-1-1T0:4:12Z",
>           "DiagnosticDataType": "OEM",
>           "EntryType": "Oem",
>           "Id": "1",
>           "Name": "CPU Crashdump",
>           "OEMDiagnosticDataType": "APMLCrashdump"
>         }
>       ],
>       "Members@odata.count": 2,
>       "Name": "Open BMC Crashdump Entries"
>     }
>     >
>
>
>
> -- 
> --------------
> Dhruvaraj S
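
To check the Crashdump entries collection quoted above from a client, the
members can be walked and each AdditionalDataURI collected. A minimal
sketch, working on an already-decoded response: cper_uris is a
hypothetical helper name, SAMPLE is trimmed from the PoC output in this
thread, and fetching the document over HTTPS is left out.

```python
import json

# Trimmed copy of the collection shown in the thread (two entries).
SAMPLE = """
{
  "@odata.id": "/redfish/v1/Systems/system/LogServices/Crashdump/Entries",
  "Members": [
    {"Id": "0",
     "AdditionalDataURI":
       "/redfish/v1/Systems/system/LogServices/Crashdump/Entries/0/ras-error0.cper"},
    {"Id": "1",
     "AdditionalDataURI":
       "/redfish/v1/Systems/system/LogServices/Crashdump/Entries/1/ras-error1.cper"}
  ],
  "Members@odata.count": 2
}
"""


def cper_uris(collection: dict) -> dict:
    """Map each entry Id to its AdditionalDataURI, skipping entries without one."""
    return {
        member["Id"]: member["AdditionalDataURI"]
        for member in collection.get("Members", [])
        if "AdditionalDataURI" in member
    }


if __name__ == "__main__":
    for entry_id, uri in cper_uris(json.loads(SAMPLE)).items():
        print(entry_id, uri)
```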