multi-bmc openbmc systems

Andrew Geissler geissonator at gmail.com
Wed Aug 31 22:59:53 AEST 2022


Greetings,

Last week, the IBM team held an internal workshop discussing some potential future designs. One interesting design point is a single “system” that contains multiple compute sleds, each with its own BMC. The host portion of each sled would be cabled to the others via Symmetric MultiProcessing (SMP) cables. Each sled host would boot in parallel, managed by its BMC. The host firmware would initially boot independently in each compute sled until it reaches the point where it trains and starts the SMP. At that point the host firmware merges into a single instance across all of the cabled compute sleds, giving a system where multiple BMCs are associated with a single host. Further, there is no rack controller BMC in the current system design.

This design brings a lot of interesting technical challenges. A few that jumped out:
- Providing a consistent external interface (Redfish) to a client that allows them to manage the entire system.
- Electing a BMC from the pool as the "system coordinator" that will act as the point of contact for the whole system and deal with the details of the BMC-to-BMC communication
- Keeping common data like BIOS consistent across the BMCs (one design point we’d like is that the host firmware is abstracted from the system coordinator BMC and can query its data from its own (pre-SMP) or any (post-SMP) BMC)
- Handling the failure of a BMC (removal from system, reconfiguration of SMP)
- Handling the failure of the BMC acting as the system coordinator (fail over to another BMC, service IP address to potentially follow)
- OpenBMC applications that will need to know the system configuration and execute commands on other BMCs
- Debug tools like obmcutil that currently operate on a single BMC system concept

There has been some great work going on within bmcweb recently that starts to solve some of these issues. This [commit][1] introduced the idea of “satellite” BMCs. bmcweb looks for entity-manager-hosted objects that provide configuration and login information for other BMCs. If bmcweb sees those objects, it queries those BMCs and returns the appropriate data in response to the Redfish query.
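
To make the aggregation idea concrete, here is a minimal sketch of merging a Redfish collection from satellite BMCs into the local one. It is not bmcweb's code: the function names, the `bmc2`-style prefix, and the rule of prefixing the last URI segment are all assumptions for illustration, and the network fetch is left out entirely.

```python
from typing import Dict


def prefix_uri(uri: str, prefix: str) -> str:
    """Prefix the last path segment of a member URI (hypothetical naming rule).

    /redfish/v1/Chassis/sled -> /redfish/v1/Chassis/<prefix>_sled
    """
    head, _, leaf = uri.rstrip("/").rpartition("/")
    return f"{head}/{prefix}_{leaf}"


def aggregate_collection(local: dict, satellites: Dict[str, dict]) -> dict:
    """Return a copy of the local collection with satellite members appended.

    `satellites` maps an assumed per-BMC prefix to the collection body that
    BMC returned for the same Redfish collection URI.
    """
    members = list(local.get("Members", []))
    for prefix, collection in sorted(satellites.items()):
        for member in collection.get("Members", []):
            # Rename each satellite member so it cannot collide with a
            # local resource that has the same ID.
            members.append({"@odata.id": prefix_uri(member["@odata.id"], prefix)})
    merged = dict(local)
    merged["Members"] = members
    merged["Members@odata.count"] = len(members)
    return merged
```

The prefix is what lets a later request for a single resource be routed back to the BMC that owns it.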

Redfish has the concept of a [composable system][2], which appears to make a lot of sense for us to implement (allowing a user to add/remove BMC sleds and perform common operations on the system).

One design point we’d really like is that all communication from BMC to BMC is via Redfish, although for leader election and data consistency we were looking into something like [etcd][3].
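
As a rough illustration of the leader-election piece, the sketch below models a lease with a time-to-live: one BMC holds the coordinator role while it keeps renewing, and another BMC can take over once the lease lapses. This is an in-memory stand-in only. It does not use etcd's API, the class and method names are hypothetical, and a real deployment would need a replicated, consistent store to hold the lease.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str        # ID of the BMC currently acting as coordinator
    expires_at: float  # time after which the lease is no longer valid


class CoordinatorElection:
    """In-memory stand-in for a shared key with a time-to-live."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.lease: Optional[Lease] = None

    def campaign(self, bmc_id: str, now: float) -> bool:
        """Try to become, or remain, coordinator. Return True on success."""
        expired = self.lease is None or now >= self.lease.expires_at
        if expired or self.lease.holder == bmc_id:
            # Either nobody holds a valid lease, or the holder is renewing.
            self.lease = Lease(bmc_id, now + self.ttl)
            return True
        return False

    def coordinator(self, now: float) -> Optional[str]:
        """Return the current coordinator, or None if the lease has lapsed."""
        if self.lease is not None and now < self.lease.expires_at:
            return self.lease.holder
        return None
```

In this model, a coordinator BMC that fails simply stops renewing, and the first BMC to campaign after the TTL runs out becomes the new coordinator.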

As we walked through the different OpenBMC applications that would be affected by this system design (state management, host firmware failures, logging, code update, LED, SMP cable validation, …) we found some functions that could probably just be handled within bmcweb. But other functions, for example one to validate that the SMP cabling is plugged in properly, would contain a lot of very specific business logic that doesn't seem to belong within bmcweb. This logic would need to talk with all BMCs and read ID bits and GPIOs to ensure the SMP cables are plugged in correctly. Should bmcweb provide D-Bus objects that this application could operate against to talk with other BMCs? Should the application talk Redfish directly to the other BMCs? Should we just bury all the logic within bmcweb itself?
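
To give a feel for the kind of business logic involved, here is a minimal sketch of a cabling check that compares an expected SMP topology against what each BMC reports. The data model (sled names, port numbers, a dictionary of observed links) is entirely hypothetical, and how the observed links are gathered (Redfish, D-Bus, or something else) is exactly the open question above.

```python
from typing import Dict, List, Tuple

Port = Tuple[str, int]  # (sled name, SMP port number), hypothetical model


def validate_cabling(expected: Dict[Port, Port],
                     observed: Dict[Port, Port]) -> List[str]:
    """Return a list of human-readable cabling errors (empty if all is well).

    `expected` maps each local port to the remote port it should reach.
    `observed` holds what was actually read back (for example from ID bits
    and GPIOs) and omits ports where no cable was detected.
    """
    errors = []
    for port, want in sorted(expected.items()):
        got = observed.get(port)
        if got is None:
            errors.append(f"{port[0]} port {port[1]}: no cable detected")
        elif got != want:
            errors.append(
                f"{port[0]} port {port[1]}: connected to {got[0]} port {got[1]}, "
                f"expected {want[0]} port {want[1]}"
            )
    return errors
```

Whichever component ends up owning this check, keeping it as a pure comparison like this would let the transport to the other BMCs change without touching the validation rules.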

Whether the system coordinator implementation is one application or more, it felt like a reasonable design point was to isolate the code from the general BMC implementation. We called this isolated code the "System Coordinator Application" (SCA). The fact that the SCA must be "mobile" in order to handle failover scenarios started to lead us toward a design point where the system coordinator logic is not within the regular bmcweb instance. Possibly a separate bmcweb-like application? Not sure here, but a differentiation between bmcweb, which runs on all BMCs, and a system coordinator application that only runs on one felt better (although it very well could just be another application within the bmcweb repo that utilizes some of the same files as bmcweb). When we discussed this on Discord, the direction was to wait and see what the code looks like.

This is a “what we’re thinking” type email and not meant to be a “this is the way it shall be”. We’re interested in any feedback from anyone who might have similar designs in the pipeline. Our goal is to start rolling out some design documents and work on some of the simpler code pieces to assess feasibility and things we have not thought of.

[1]: https://gerrit.openbmc.org/c/openbmc/bmcweb/+/53310 
[2]: https://redfish.dmtf.org/redfish/mockups/v1/1151
[3]: https://etcd.io/

