<div dir="ltr">I'm going to resurrect this thread for the new year.<div><br></div><div>It sounds like there is a decent need for some type of aggregator.  Would anyone be interested in setting up a meeting to try to synthesize our use cases into some broadly applicable requirements?</div><div><br></div><div>I'm located on the West Coast, but I have a pretty flexible schedule for other time zones next week.</div><div><br></div><div>Some topics for us to discuss (either in a meeting or offline) include:</div><div><br></div><div>1) Layer 2/3 discovery and negotiation</div><div>2) Caching, proxy, and consistency requirements</div><div>3) Target hardware, performance requirements, and scale of aggregation</div><div>4) Tooling and infrastructure improvements needed to support an aggregator</div><div>5) Amount of configuration and knowledge an aggregator needs a priori</div><div><br></div><div>Any ideas on what else we can cover?  Is there a preferred format or medium that would work best to gather these higher-level requirements?</div><div><br></div><div></div><div>Regards,</div><div>Richard</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Dec 19, 2019 at 2:17 AM vishwa <<a href="mailto:vishwa@linux.vnet.ibm.com">vishwa@linux.vnet.ibm.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

  <div bgcolor="#FFFFFF">

    <p>Richard, <br>

    </p>

    <p>Thanks for putting it together.<br>

    </p>

    <div>On 12/13/19 1:32 AM, Richard Hanley

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr">In our case we are working to migrate away from

        IPMI to Redfish.  Most of the solutions I've been thinking about

        have leaned pretty heavily into that.

        <div><br>

        </div>

        <div>In my mind I've sliced this project up into a few different

          areas.

          <div><br>

          </div>

          <div><b>Merging/Transforming Redfish Resources</b></div>

          <div>Let's say that there are several Redfish services.  They

            will have collections of Systems, Chassis, and Managers

            that need to be merged.  In the simplest uses this would be

            just an HTTP proxy cache with some URL cleaning.</div>
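          <div><br>

          </div>

          <div>A minimal sketch of the simple case, in Python, assuming each
            downstream service is known by a short hypothetical name (bmc0,
            bmc1) and that prefixing member IDs with that name is an
            acceptable form of URL cleaning.  The function name
            merge_collections and the ID scheme are illustrative only and
            are not taken from any Redfish or OpenBMC implementation.</div>

<pre>
```python
def merge_collections(collections, odata_id):
    """Merge Redfish collections from several services into one.

    `collections` maps a hypothetical service name (e.g. "bmc0") to the
    collection body that service returned.  Each member URL is rewritten
    so the service name becomes part of the member ID, which keeps
    identical IDs on different services from colliding.
    """
    members = []
    for service, collection in sorted(collections.items()):
        for member in collection.get("Members", []):
            # Split "/redfish/v1/Systems/system" into parent path and leaf ID.
            parent, _, leaf = member["@odata.id"].rpartition("/")
            members.append({"@odata.id": f"{parent}/{service}_{leaf}"})
    return {
        "@odata.id": odata_id,
        "Members": members,
        "Members@odata.count": len(members),
    }
```
</pre>

          <div>A real proxy would also need the reverse mapping (from
            bmc0_system back to the owning service and its original URL)
            to route follow-up requests.</div>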

          <div><br>

          </div>

          <div>However, this could end up being a pretty deep merge in

            cases where some resources are split across multiple

            management domains.  Memory errors being on one node, but

            the temperature sensor being on a separate node is a good

            example. Another example would be the "ContainedBy" link. 

            These links might reach across different BMC boundaries, and

            would need to be inserted by the primary node. </div>

          <div><br>

          </div>

          <div><b>Aggregating Services and Actions</b></div>

          <div>This is where I think the DMTF proposals for Redfish

            aggregation (located <a href="https://members.dmtf.org/apps/org/workgroup/redfish/document.php?document_id=91811" target="_blank">here</a>) provide

            the most insight.  My reading of this proposal is that an

            aggregation service would be used to tie actions together. 

            For example, there may be an individual chassis reset action

            embedded in each chassis resource, and then an aggregated

            action for a full reset.</div>
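          <div><br>

          </div>

          <div>To make the idea of an aggregated action concrete, here is a
            rough Python sketch.  It is not the API from the DMTF proposal.
            The callable reset_one is a hypothetical stand-in for whatever
            issues the per-chassis reset (for example a POST to that
            chassis's Chassis.Reset action) and returns the HTTP status
            code; the function names are made up for illustration.</div>

<pre>
```python
def aggregate_reset(chassis_ids, reset_one, reset_type="PowerCycle"):
    """Fan one reset request out to every chassis in the aggregate.

    `reset_one(chassis_id, reset_type)` is a hypothetical callable that
    performs the per-chassis reset and returns an HTTP status code.
    Returns a dict of chassis ID to status so the caller can see
    partial failures instead of a single pass/fail flag.
    """
    results = {}
    for chassis_id in chassis_ids:
        results[chassis_id] = reset_one(chassis_id, reset_type)
    return results


def failed_members(results):
    """Return the chassis IDs whose reset did not return a 2xx status."""
    return sorted(cid for cid, status in results.items() if status // 100 != 2)
```
</pre>

          <div>Whether a partial failure should roll back, retry, or just
            be reported is exactly the kind of policy the arbiter of the
            aggregation would have to decide.</div>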

          <div><br>

          </div>

          <div>DMTF seems to be leaving the arbiter of the aggregation

            up to the implementation.  I'd imagine that some

            implementations would provide a static aggregation service,

            while others would allow clients to create their own dynamic

            aggregates.</div>

          <div><b><br>

            </b></div>

          <div><b>Discovery, Negotiation, and Error Recovery</b></div>

          <div>This is an area where I'd like to hear more about your

            requirements, Vishwa.  Would you expect the BMC cluster to

            be hot-swappable?  Is there a particular reason that it has

            to be peer to peer? What kind of error recovery should be

            supported when a node fails? </div>

          <div><br>

          </div>

          <div>At a high level, the idea that has been suggested

            internally is to have a designated master node at install

            time.  That node would discover any other Redfish services

            on the LAN, and begin aggregating them.  The master node

            would keep an in-memory cache of the other services, and

            reload resources on demand.  If a node goes down, then the

            error is propagated using HTTP return codes.  If the master

            node goes down, then the entire aggregate will go down.  In

            theory a client could talk to individual nodes if it needed

            to.</div>
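          <div><br>

          </div>

          <div>A rough Python sketch of that caching behavior, under some
            loud assumptions: fetch is a hypothetical callable that talks
            to a downstream node and returns (status, body), an
            unreachable node surfaces as an OSError, and the master
            reports that case as 502.  None of these choices come from a
            spec; they are only there to make the flow concrete.</div>

<pre>
```python
class AggregateCache:
    """In-memory cache of downstream Redfish resources, reloaded on demand."""

    def __init__(self, fetch):
        # Hypothetical callable: fetch(node, path) returns (status, body).
        self._fetch = fetch
        self._cache = {}

    def get(self, node, path, refresh=False):
        """Return (status, body), serving from cache unless refresh is set."""
        key = (node, path)
        if not refresh and key in self._cache:
            return 200, self._cache[key]
        try:
            status, body = self._fetch(node, path)
        except OSError:
            # Node is unreachable: drop any stale copy and report the
            # failure as an HTTP status (502 is an assumption here).
            self._cache.pop(key, None)
            return 502, None
        if status == 200:
            self._cache[key] = body
        else:
            # Propagate the downstream error code unchanged.
            self._cache.pop(key, None)
        return status, body
```
</pre>

          <div>Note that this cache never expires entries on its own, so
            consistency would depend on the client asking for a refresh
            or on some event-driven invalidation.</div>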

          <div><b><br>

            </b></div>

        </div>

      </div>

    </blockquote>

    <p>Case-1:<br>

      .......<br>

    </p>

    <p>Consider a hypothetical case where I have 4 compute nodes, each

      having a BMC in it, and that BMC is responsible for initiating

      power-on and other services for that node / getting the debug data

      out of that node / etc...</p>

    <p>We would want an external Management Console(MC) to manage this

      rack. Instead of going to 4 nodes separately, MC can ask 1 BMC

      that I am calling the "Point Of Contact" BMC / Primary BMC for that

      rack. It is the job of that BMC to do whatever is needed to return

      the result.</p>

    <p>Similarly, when the POC goes down, we would need another POC.</p>

    <p>I believe Redfish discovery can be used to discover each BMC.

      But how does the heartbeat work between the discovered BMCs?<br>

      Also, when the POC goes down, how can we sense that and make some

      other BMC the POC using the Redfish framework?</p>
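
    <p>To frame the question, here is a minimal Python sketch of the
      bookkeeping a heartbeat scheme would need, independent of how the
      beats are actually carried between BMCs. The class name, the
      timeout, and the "lowest ID among live BMCs becomes POC" rule are
      all assumptions for illustration; the sketch does not handle
      network partitions or two BMCs disagreeing about who is alive.</p>

<pre>
```python
class HeartbeatTable:
    """Track the last heartbeat seen from each BMC and pick a POC."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}  # BMC ID mapped to time of last heartbeat

    def beat(self, bmc_id, now):
        """Record a heartbeat from bmc_id at time `now` (seconds)."""
        self.last_seen[bmc_id] = now

    def alive(self, now):
        """Return the sorted IDs whose last beat is within the timeout."""
        return sorted(
            bmc_id
            for bmc_id, seen in self.last_seen.items()
            if self.timeout_s >= now - seen
        )

    def elect_poc(self, now):
        """Pick the POC: lowest ID among live BMCs, or None if none."""
        live = self.alive(now)
        return live[0] if live else None
```
</pre>

    <p>With a 5 second timeout, if bmc0 last beat at t=0 and bmc1 at
      t=3, then bmc0 is the POC at t=4 and bmc1 takes over at t=7.</p>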

    <p><br>

      Case-2:<br>

      .......</p>

    <p>I have a control node that is housing 2 BMCs. One can be Primary

      and the other can be Slave. Each BMC has the complete view of the

      whole system. <br>

    </p>

    <p>I am assuming we could still discover the other BMC using

      Redfish. But again, how do we exchange heartbeats and perform failover

      operations?</p>

    <p>Thanks,</p>

    <p>!! Vishwa !!<br>

    </p>

    <blockquote type="cite">

      <div dir="ltr">

        <div>

          <div><b> Authentication and Authorization</b></div>

        </div>

        <div>This is an area where I think Redfish is a little hands

          off.  In an ideal world ACLs could be setup without

          proliferating username/passwords across nodes.  As an aside,

          we've been thinking about how to use Redfish without any

          usernames or passwords.  By using a combination of

          certificates and authorization tokens it should be possible to

          extend a security zone to a small cluster of BMCs.</div>

        <div><br>

        </div>

        <div>Regards,</div>

        <div>Richard</div>

      </div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Wed, Dec 11, 2019 at 11:33

          PM Neeraj Ladkani <<a href="mailto:neladk@microsoft.com" target="_blank">neladk@microsoft.com</a>>

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

          <div lang="EN-US">

            <div>

              <p class="MsoNormal"><span style="color:windowtext">Sure,

                  how do we want to enable BMC-BMC communication?

                  Standard Redfish/IPMI?

                </span></p>

              <p class="MsoNormal"><span style="color:windowtext"> </span></p>

              <p class="MsoNormal"><span style="color:windowtext">Neeraj</span></p>

              <p class="MsoNormal"><span style="color:windowtext"> </span></p>

              <p class="MsoNormal"><span style="color:windowtext"> </span></p>

              <div>

                <div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in">

                  <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> vishwa <<a href="mailto:vishwa@linux.vnet.ibm.com" target="_blank">vishwa@linux.vnet.ibm.com</a>>

                      <br>

                      <b>Sent:</b> Wednesday, December 11, 2019 10:59 PM<br>

                      <b>To:</b> Neeraj Ladkani <<a href="mailto:neladk@microsoft.com" target="_blank">neladk@microsoft.com</a>><br>

                      <b>Cc:</b> <a href="mailto:openbmc@lists.ozlabs.org" target="_blank">openbmc@lists.ozlabs.org</a>;

                      <a href="mailto:sgundura@in.ibm.com" target="_blank">sgundura@in.ibm.com</a>;

                      <a href="mailto:kusripat@in.ibm.com" target="_blank">kusripat@in.ibm.com</a>;

                      <a href="mailto:shahjsha@in.ibm.com" target="_blank">shahjsha@in.ibm.com</a>;

                      <a href="mailto:vikantan@in.ibm.com" target="_blank">vikantan@in.ibm.com</a>;

                      Richard Hanley <<a href="mailto:rhanley@google.com" target="_blank">rhanley@google.com</a>><br>

                      <b>Subject:</b> Re: [EXTERNAL] Re: Managing

                      heterogeneous systems</span></p>

                </div>

              </div>

              <p class="MsoNormal"> </p>

              <div>

                <p class="MsoNormal">On 12/10/19 3:20 PM, Neeraj Ladkani

                  wrote:</p>

              </div>

              <blockquote style="margin-top:5pt;margin-bottom:5pt">

                <p class="MsoNormal"><span style="color:rgb(0,32,96)">Great

                    discussion. </span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)">The

                    problem is not physical interface as they can

                    communicate using LAN. The problem is entity binding

                    as one compute node can be connected to 1 or more

                    storage nodes. How can we have one view of system

                    from operational perspective? Power on/off, SEL

                    logs, telemetry? </span></p>

              </blockquote>

              <div>

                <p class="MsoNormal"><span style="color:windowtext"> </span></p>

              </div>

              <div>

                <p class="MsoNormal"><span style="color:windowtext"><br>

                    Correct. This is where I mentioned the "Primary

                    BMC acting as Point Of Contact" for external

                    requests.<br>

                    Depending on how we want to service the request, we

                    could orchestrate that via PoC BMC or respond to

                    external requesters on where they can get the data

                    and they connect to 'em directly.</span></p>

              </div>

              <div>

                <p class="MsoNormal" style="margin-bottom:12pt"><span style="color:windowtext"><br>

                    !! Vishwa !!</span></p>

              </div>

              <blockquote style="margin-top:5pt;margin-bottom:5pt">

                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)">Some

                    of the problems:</span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>

                <ol style="margin-top:0in" start="1" type="1">

                  <li style="color:rgb(0,32,96);margin-left:0in">

                    Power operations: power on/off and resets need to be

                    coordinated across all nodes in a system </li>

                  <li style="color:rgb(0,32,96);margin-left:0in">

                    Telemetry: the OS runs only on the head node, so

                    requests to read telemetry should return

                    telemetry (SEL logs, sensor values) from all the

                    nodes.

                  </li>

                  <li style="color:rgb(0,32,96);margin-left:0in">

                    Firmware Update</li>

                  <li style="color:rgb(0,32,96);margin-left:0in">

                    RAS: memory errors are logged by UEFI SMM on the head

                    node, but the corresponding DIMM temperature and inlet

                    temperature are logged on a secondary node, and the two

                    are not mapped to each other.  </li>

                </ol>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)">I

                    have been exploring a couple of routes:

                  </span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>

                <ol style="margin-top:0in" start="1" type="1">

                  <li style="color:rgb(0,32,96);margin-left:0in">

                    LUN discovery and routing: this is similar to IPMI,

                    but I am working on an architecture to extend it to

                    support multiple LUNs and route them from the head node

                    (we would need LUN routing over LAN).

                  </li>

                  <li style="color:rgb(0,32,96);margin-left:0in">

                    Redfish hierarchy for systems </li>

                </ol>

                <pre><span style="color:black">    "Systems": {</span></pre>

                <pre><span style="color:black">        "@odata.id": "/redfish/v1/Systems"</span></pre>

                <pre><span style="color:black">    },</span></pre>

                <pre><span style="color:black">    "Chassis": {</span></pre>

                <pre><span style="color:black">        "@odata.id": "/redfish/v1/Chassis"</span></pre>

                <pre><span style="color:black">    },</span></pre>

                <pre><span style="color:black">    "Managers": {</span></pre>

                <pre><span style="color:black">        "@odata.id": "/redfish/v1/Managers"</span></pre>

                <pre><span style="color:black">    },</span></pre>

                <pre><span style="color:black">    "AccountService": {</span></pre>

                <pre><span style="color:black">        "@odata.id": "/redfish/v1/AccountService"</span></pre>

                <pre><span style="color:black">    },</span></pre>

                <pre><span style="color:black">    "SessionService": {</span></pre>

                <pre><span style="color:black">        "@odata.id": "/redfish/v1/SessionService"</span></pre>

                <pre><span style="color:black">    },</span></pre>

                <pre><span style="color:black">    "Links": {</span></pre>

                <pre><span style="color:black">        "Sessions": {</span></pre>

                <pre><span style="color:black">            "@odata.id": "/redfish/v1/SessionService/Sessions"</span></pre>

                <pre><span style="color:black">        }</span></pre>

                <pre><span style="color:black">    }</span></pre>

                <pre style="margin-left:0.5in"><span>3.<span style="font:7pt 'Times New Roman'">  </span></span><span style="font-family:Calibri,sans-serif;color:rgb(0,32,96)">Custom Messaging over LAN (PubSub)</span></pre>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)">I

                    am also working on a whitepaper in the same area :).  Happy to work with you

                    guys if you have any ideas on how we can standardize

                    this.

                  </span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>

                <p class="MsoNormal"><span style="color:rgb(0,32,96)">Neeraj</span></p>

                <p class="MsoNormal"><span style="color:windowtext"> </span></p>

                <div>

                  <div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in">

                    <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> vishwa

                        <a href="mailto:vishwa@linux.vnet.ibm.com" target="_blank"><vishwa@linux.vnet.ibm.com></a>

                        <br>

                        <b>Sent:</b> Tuesday, December 10, 2019 1:00 AM<br>

                        <b>To:</b> Richard Hanley <a href="mailto:rhanley@google.com" target="_blank"><rhanley@google.com></a>;

                        Neeraj Ladkani

                        <a href="mailto:neladk@microsoft.com" target="_blank"><neladk@microsoft.com></a><br>

                        <b>Cc:</b> <a href="mailto:openbmc@lists.ozlabs.org" target="_blank">openbmc@lists.ozlabs.org</a>;

                        <a href="mailto:sgundura@in.ibm.com" target="_blank">sgundura@in.ibm.com</a>;

                        <a href="mailto:kusripat@in.ibm.com" target="_blank">

                          kusripat@in.ibm.com</a>; <a href="mailto:shahjsha@in.ibm.com" target="_blank">shahjsha@in.ibm.com</a>;

                        <a href="mailto:vikantan@in.ibm.com" target="_blank">vikantan@in.ibm.com</a><br>

                        <b>Subject:</b> [EXTERNAL] Re: Managing

                        heterogeneous systems</span></p>

                  </div>

                </div>

                <p class="MsoNormal"> </p>

                <p>Hi Richard / Neeraj,</p>

                <p>Thanks for bringing this up. It's one of the

                  interesting topics for IBM.</p>

                <p>Some of the thoughts here.....</p>

                <p>When we have multiple BMCs as part of a single

                  system, then there are 3 main parts to it.</p>

                <p>1/. Discovering the peer BMCs and role assignment<br>

                  2/. Monitoring the existence of peer BMCs - heartbeat

                  <br>

                  3/. In the event of losing the master, detect that

                  using #2 and then reassign the role</p>

                <p>Depending on how we want to establish the roles, we

                  could have Single-Master, Many-slave or Multi-Master,

                  Multi-Slave. etc</p>

                <p>One of the teams here is trying to do a POC for Multi

                  BMC architecture and is still in the very beginning

                  stage.

                  <br>

                  The team is currently studying/evaluating the

                  available solutions - Corosync / Heartbeat /

                  Pacemaker.<br>

                  Corosync works nicely with clusters, but we need to

                  see if we can trim it down for BMC.<br>

                  <br>

                  If we cannot use Corosync for some reason, then we need

                  to see if we can do the discovery using PLDM

                  (probably using the terminus IDs)<br>

                  and come up with custom rules for assigning

                  Master-Slave roles.</p>

                <p>If we choose to have Single-Master and Many-Slave, we

                  could have that Single-Master as an entity acting as a

                  Point of Contact for external requests, which could then

                  orchestrate with the needed BMCs internally to get the

                  job done.</p>

                <p>I would be happy to hear of any alternatives

                  that suit a BMC kind of architecture.</p>

                <p>!! Vishwa !!</p>

                <div>

                  <p class="MsoNormal">On 12/10/19 4:32 AM, Richard

                    Hanley wrote:</p>

                </div>

                <blockquote style="margin-top:5pt;margin-bottom:5pt">

                  <div>

                    <p class="MsoNormal">Hi Neeraj, </p>

                    <div>

                      <p class="MsoNormal"> </p>

                    </div>

                    <div>

                      <p class="MsoNormal">This is an open question that

                        I've been looking into as well.  </p>

                    </div>

                    <div>

                      <p class="MsoNormal"> </p>

                    </div>

                    <div>

                      <p class="MsoNormal">For BMC to BMC communication

                        there are a few options.</p>

                    </div>

                    <div>

                      <ol start="1" type="1">

                        <li class="MsoNormal">

                          If you have network connectivity you can

                          communicate using Redfish.</li>

                        <li class="MsoNormal">

                          If you only have a PCIe connection, you'll

                          have to use either the inband connection or

                          the side band I2C*.  PLDM and MCTP are

                          protocols that are defined to handle this use

                          case, although I'm not sure if the OpenBMC

                          implementations have been used in production.</li>

                        <li class="MsoNormal">

                          There is always IPMI, which has its own

                          pros/cons.</li>

                      </ol>

                      <div>

                        <p class="MsoNormal">For taking several BMCs and

                          aggregating them into a single logical

                          interface that is exposed to the outside

                          world, there are a few things happening on

                          that front.  DMTF has been working on an

                          aggregation protocol for Redfish.  However,

                          it's my understanding that their proposal is

                          more directed at the client level, as opposed

                          to within a single "system".</p>

                      </div>

                    </div>

                    <div>

                      <p class="MsoNormal"> </p>

                    </div>

                    <div>

                      <p class="MsoNormal">I just recently joined the

                        community, but I've been thinking about how a

                        proxy layer could merge two Redfish services

                        together.  Since Redfish is fairly strongly

                        typed and has a well defined mechanism for OEM

                        extensions, this should be pretty generally

                        applicable.  I am planning on having a white

                        paper on the issue sometime after the holidays.</p>

                    </div>

                    <div>

                      <p class="MsoNormal"> </p>

                    </div>

                    <div>

                      <p class="MsoNormal">Another thing to note,

                        recently DMTF released a spec for running a

                        binary Redfish over PLDM called RDE.  That might

                        be a useful way of tying all these concepts

                        together.  </p>

                    </div>

                    <div>

                      <p class="MsoNormal"> </p>

                    </div>

                    <div>

                      <p class="MsoNormal">I'd be curious about your

                        thoughts and use cases here.  Would either PLDM

                        or Redfish fit your use case?</p>

                    </div>

                    <div>

                      <p class="MsoNormal"> </p>

                    </div>

                    <div>

                      <p class="MsoNormal">Regards,</p>

                    </div>

                    <div>

                      <p class="MsoNormal">Richard</p>

                    </div>

                    <div>

                      <p class="MsoNormal"> </p>

                    </div>

                    <div>

                      <p class="MsoNormal">*I've heard of some proposals

                        that run a network interface over PCIe.  I don't

                        know enough about PCIe to know if this is a good

                        idea.</p>

                    </div>

                  </div>

                  <p class="MsoNormal"> </p>

                  <div>

                    <div>

                      <p class="MsoNormal">On Mon, Dec 9, 2019 at 1:27

                        PM Neeraj Ladkani <<a href="mailto:neladk@microsoft.com" target="_blank">neladk@microsoft.com</a>>

                        wrote:</p>

                    </div>

                    <blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">

                      <div>

                        <div>

                          <p class="MsoNormal">Are there any standards

                            in managing heterogeneous systems? For

                            example, a rack may have a compute

                            node (with its own BMC) and a storage node

                            (with its own BMC) connected using a PCIe

                            switch.  How are these two BMCs represented as

                            one system?  Are there any standards for

                            BMC – BMC communication?

                          </p>

                          <p class="MsoNormal"> </p>

                          <p class="MsoNormal"> </p>

                          <p class="MsoNormal">Neeraj</p>

                          <p class="MsoNormal"> </p>

                        </div>

                      </div>

                    </blockquote>

                  </div>

                </blockquote>

              </blockquote>

            </div>

          </div>

        </blockquote>

      </div>

    </blockquote>

  </div>

</blockquote></div>