Summarizing Meeting on BMC Aggregation
vishwa
vishwa at linux.vnet.ibm.com
Tue Jan 28 01:58:01 AEDT 2020
Missed mentioning this variant.
All 4 nodes in the rack together form one machine. So, a power-on
would mean powering on all the nodes. Similarly, "get the data" would
mean getting the data from all the nodes.
From an external entity there is ONE power-on. However, it needs to be
translated into 4 power-ons, one per BMC in the rack.
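As a rough sketch of that fan-out (the node BMC names are placeholders, and the stubbed power-on stands in for a real Redfish request; this is illustration only, not a proposal):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical node BMC identifiers; real addresses/credentials would differ.
NODE_BMCS = ["bmc-node0", "bmc-node1", "bmc-node2", "bmc-node3"]

def power_on_node(bmc):
    # A real aggregator would send a Redfish ComputerSystem.Reset action to
    # the node BMC here; this stub only models the per-node result.
    return (bmc, "ok")

def power_on_machine():
    """One external power-on fans out into one power-on per node BMC."""
    with ThreadPoolExecutor(max_workers=len(NODE_BMCS)) as pool:
        results = dict(pool.map(power_on_node, NODE_BMCS))
    # The aggregator folds the per-node results into a single machine status.
    return all(status == "ok" for status in results.values()), results

ok, per_node = power_on_machine()
```

The point is only that the caller sees one operation and one combined status, while the aggregator tracks the individual node results.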
Thanks,
!!Vishwa !!
On 1/27/20 3:19 PM, vishwa wrote:
> Hi Richard,
>
> Thanks for capturing and sharing the discussion here. If I am reading
> it all correctly, it looks like the aggregator here is an external
> entity and not part of one of the BMCs in the domain. To somewhat
> relate, this is an aggregator along the lines of Nagios. Did I get that right?
>
> The email mentions "data and control". Could you give an example of
> how the problem statements below might be handled and executed by
> the proposed aggregator ?
>
> *Hypothetical Problems*:
>
> Case-1 : I have 4 nodes in the rack, each with a BMC inside that is
> responsible for managing THAT node.
> I want to power on all the nodes in the rack, and I want to use Redfish
> from a management console.
> Where is the aggregator in this setup, and how is it orchestrated ?
>
> Case-2 : Some BMC fails to power on its node, and it needs to
> report the error back to the initiator.
>
> Thank you very much for taking this initiative,
>
> !! Vishwa !!
>
> On 1/17/20 1:45 AM, Richard Hanley wrote:
>> Hi everyone,
>>
>> We had a meeting today to talk about BMC aggregation. I wanted to
>> thank everyone who joined.
>>
>> Below is my summary of the topics we discussed and some of the
>> action items I took from the meeting. Please let me know if there
>> was something important that I missed or mischaracterized.
>> ------------------------------------------------------------------------------------------------------
>>
>>
>> There is a strong need to aggregate data and control features from
>> multiple BMCs into a single uniform view of a "machine."
>>
>> The definition of a machine here is deliberately loose, but it can be
>> thought of as an atomic physical unit for management. A machine is
>> then split into multiple domains, each of which is managed by some
>> management controller (in most cases a BMC). There may be
>> some cases where a domain has multiple BMCs for redundancy.
>>
>> Domains are relatively close to each other physically. Sometimes they
>> will be in the same chassis/enclosure, while in other cases they will be
>> in an adjacent tray.
>>
>> One key point that was discussed in this meeting was that the data
>> and transport of these domains are relatively unconstrained. Domains
>> may be connected to the aggregator via a LAN, but there is a
>> community need to support other transports like SMBus and PCIe.
>>
>> An aggregator will likely need to be split up into three layers:
>>
>> 1) The lowest layer would detect, import, and transform individual
>> domains into a common data model. We would need to provide a
>> specification for that data model and tooling for implementers to
>> create their own instance of a domain's data.
>>
>> 2) An aggregation layer would take the instances of these domain-
>> level data models and aggregate them into a single view or graph of
>> the system. This process could be largely automated graph
>> manipulation.
>>
>> 3) A presentation layer would take that aggregate, and expose it to
>> the outside world. This presentation layer could be Redfish, but
>> there is some divergence on that (see below). Regardless, we would
>> need tooling to program against the data model for implementers to
>> modify their presentation layers as needed.
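The three layers above could be sketched roughly like this (the function names and the dict-based shape of the "common data model" are made up for illustration; nothing here is an agreed design):

```python
# Layer 1: a protocol-specific importer normalizes one domain's data
# into the (assumed) common data model.
def import_domain(raw, transport):
    return {"transport": transport, "resources": raw}

# Layer 2: aggregation merges the domain models into one view of the machine.
def aggregate(domains):
    return {"machine": {f"domain{i}": d for i, d in enumerate(domains)}}

# Layer 3: presentation exposes the aggregate to the outside world
# (here just by listing the domains; Redfish would be one real option).
def present(view):
    return sorted(view["machine"])

domains = [import_domain({"power": "on"}, t) for t in ("redfish", "pldm")]
machine_view = aggregate(domains)
```

Each layer only depends on the output of the one below it, which is what lets the presentation layer be swapped without touching the importers.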
>>
>> There is fairly broad agreement that Layer 1 would need to support
>> multiple protocols, including Redfish, PLDM/MCTP, and legacy IPMI
>> systems. There would need to be support for creating custom drivers
>> for importing these various transports into a common data model.
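One way such custom drivers might plug in is a per-transport registry, sketched below (the decorator, registry, and importer names are all invented for illustration):

```python
# Sketch of pluggable Layer-1 drivers; the registry and names are hypothetical.
DRIVERS = {}

def driver(transport):
    """Decorator registering a custom importer for one transport."""
    def register(fn):
        DRIVERS[transport] = fn
        return fn
    return register

@driver("redfish")
def from_redfish(raw):
    # Redfish data already resembles a resource tree; pass it through.
    return {"source": "redfish", "resources": raw}

@driver("ipmi")
def from_ipmi(sensors):
    # Legacy IPMI readings would need translating into the common model.
    return {"source": "ipmi", "resources": {"sensors": sensors}}

def import_with_driver(transport, raw):
    return DRIVERS[transport](raw)
```

An implementer supporting a new transport would then only add one registered importer, without touching the aggregation or presentation layers.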
>>
>> There are some diverging needs when it comes to the presentation
>> layer. Here at Google, we were planning to have the presentation
>> layer be primarily Redfish, with the common data model being
>> Redfish focused. Neeraj pointed out that there are needs for
>> other presentation layers besides Redfish.
>>
>> Some other design considerations include the hardware target for this
>> aggregator. The aggregator will have to run on an OpenBMC platform,
>> but Google also needs an aggregator that can run on host Linux
>> machines for legacy platforms without an out-of-band connection.
>>
>> Another consideration is the security of this aggregator. The
>> aggregation layer will have the primary responsibility of
>> adjudicating authentication and authorization for the subordinate
>> nodes.
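That adjudication point might look something like the following (the token and role scheme is invented purely to illustrate where the decision is made):

```python
# Hypothetical tokens and roles; a real aggregator would use proper sessions.
ROLES = {"admin-token": {"power", "read"}, "viewer-token": {"read"}}

def authorize(token, action):
    """The aggregator decides; subordinate BMCs never see the caller's token."""
    return action in ROLES.get(token, set())

def forward(token, action, nodes):
    if not authorize(token, action):
        raise PermissionError(f"{action} not allowed")
    # Would be forwarded using the aggregator's own per-node credentials.
    return [(node, action) for node in nodes]
```

Centralizing the check this way keeps the subordinate BMCs from each needing to know about external users.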
>>
>> One of the key takeaways (for me, anyway) from this meeting is that
>> there is community interest in keeping this aggregator generic and
>> not tied too closely to a particular protocol, transport, or
>> presentation layer. There was mention of the CIM data model, which may
>> be appropriate for this situation.
>>
>> We will be having follow-up meetings because this project is going to
>> take some time to scope out and design. I will be researching prior
>> art for existing data models that we could build a presentation layer
>> off of.
>>
>> Regards,
>> Richard
>
More information about the openbmc
mailing list