<div dir="ltr">I'm going to resurrect this thread for the new year.<div><br></div><div>It sounds like there is a decent need for some type of aggregator.  Would anyone be interested in setting up a meeting to try to synthesize our use cases into some broadly applicable requirements?</div><div><br></div><div>I'm located on the West Coast, but I have a pretty flexible schedule for other time zones next week.</div><div><br></div><div>Some topics for us to discuss (either in a meeting or offline) include:</div><div><br></div><div>1) Layer 2/3 discovery and negotiation</div><div>2) Caching, proxy, and consistency requirements</div><div>3) Target hardware, performance requirements, and scale of aggregation</div><div>4) Tooling and infrastructure improvements needed to support an aggregator</div><div>5) Amount of configuration and knowledge an aggregator needs a priori</div><div><br></div><div>Any ideas on what else we can cover?  Is there a preferred format or medium that would work best to gather these higher-level requirements?</div><div><br></div><div>Regards,</div><div>Richard</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Dec 19, 2019 at 2:17 AM vishwa <<a href="mailto:vishwa@linux.vnet.ibm.com">vishwa@linux.vnet.ibm.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
  
    
  
  <div bgcolor="#FFFFFF">
    <p>Richard, <br>
    </p>
    <p>Thanks for putting it together.<br>
    </p>
    <div>On 12/13/19 1:32 AM, Richard Hanley
      wrote:<br>
    </div>
    <blockquote type="cite">
      
      <div dir="ltr">In our case we are working to migrate away from
        IPMI to Redfish.  Most of the solutions I've been thinking about
        have leaned pretty heavily into that.
        <div><br>
        </div>
        <div>In my mind I've sliced this project up into a few different
          areas.
          <div><br>
          </div>
          <div><b>Merging/Transforming Redfish Resources</b></div>
          <div>Let's say that there are several Redfish services.  They
            will have collections of Systems, Chassis, and Managers
            that need to be merged.  In the simplest uses this would be
            just an HTTP proxy cache with some URL cleaning.</div>
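          <div><br>
          </div>
          <div>A minimal Python sketch of the simplest case described above: merging the Members of one collection from several Redfish services and cleaning the URLs so they stay unique. The service names (bmc0, bmc1) and the prefixing scheme are assumptions made for illustration, not something the Redfish specification or this thread prescribes.</div>

```python
def merge_collections(collections):
    """Merge Redfish collection payloads from several services.

    `collections` maps a hypothetical service name (e.g. "bmc0") to the
    JSON body of the same collection on that service. Each member URL is
    rewritten so its last path segment is prefixed with the service name.
    """
    members = []
    for service, payload in collections.items():
        for member in payload.get("Members", []):
            collection_path, _, member_id = member["@odata.id"].rpartition("/")
            members.append(
                {"@odata.id": f"{collection_path}/{service}_{member_id}"}
            )
    return {"Members": members, "Members@odata.count": len(members)}


if __name__ == "__main__":
    merged = merge_collections({
        "bmc0": {"Members": [{"@odata.id": "/redfish/v1/Systems/system"}]},
        "bmc1": {"Members": [{"@odata.id": "/redfish/v1/Systems/system"}]},
    })
    print(merged)
```

          <div>A real proxy would also need to map the rewritten URLs back to the originating service when a client follows them; that reverse lookup is left out here.</div>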
          <div><br>
          </div>
          <div>However, this could end up being a pretty deep merge in
            cases where some resources are split across multiple
            management domains.  Memory errors being on one node, but
            the temperature sensor being on a separate node is a good
            example. Another example would be the "ContainedBy" link. 
            These links might reach across different BMC boundaries, and
            would need to be inserted by the primary node. </div>
          <div><br>
          </div>
          <div><b>Aggregating Services and Actions</b></div>
          <div>This is where I think the DMTF proposals for Redfish
            aggregation (located <a href="https://members.dmtf.org/apps/org/workgroup/redfish/document.php?document_id=91811" target="_blank">here</a>) provide
            the most insight.  My reading of this proposal is that an
            aggregation service would be used to tie actions together. 
            For example, there may be individual chassis reset actions
            embedded in the chassis resources, and then an aggregated
            action for a full reset.</div>
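          <div><br>
          </div>
          <div>To make the idea of an aggregated action concrete, here is a rough Python sketch that fans a full reset out to the individual chassis reset actions. This is my own illustration, not the DMTF proposal's design. The post callable and the chassis IDs are hypothetical; the action path follows the usual Chassis.Reset form.</div>

```python
def aggregate_reset(chassis_ids, post, reset_type="ForceRestart"):
    """Invoke the Chassis.Reset action on every chassis in the aggregate.

    `post` is a hypothetical callable (uri, body) -> HTTP status code that
    the caller supplies, for example a thin wrapper around an HTTP client.
    Returns a dict mapping each chassis ID to the status it reported.
    """
    results = {}
    for chassis_id in chassis_ids:
        uri = f"/redfish/v1/Chassis/{chassis_id}/Actions/Chassis.Reset"
        results[chassis_id] = post(uri, {"ResetType": reset_type})
    return results


if __name__ == "__main__":
    def fake_post(uri, body):
        # Stand-in transport that accepts every request.
        print("POST", uri, body)
        return 204

    print(aggregate_reset(["chassis0", "chassis1"], fake_post))
```

          <div>Returning the per-chassis status leaves it to the caller to decide whether a partial failure should fail the whole aggregated action.</div>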
          <div><br>
          </div>
          <div>DMTF seems to be leaving the arbiter of the aggregation
            up to the implementation.  I'd imagine that some
            implementations would provide a static aggregation service,
            while others would allow clients to create their own dynamic
            aggregates.</div>
          <div><b><br>
            </b></div>
          <div><b>Discovery, Negotiation, and Error Recovery</b></div>
          <div>This is an area where I'd like to hear more about your
            requirements, Vishwa.  Would you expect the BMC cluster to
            be hot-swappable?  Is there a particular reason that it has
            to be peer to peer? What kind of error recovery should be
            supported when a node fails? </div>
          <div><br>
          </div>
          <div>At a high level, the idea that has been suggested
            internally is to have a designated master node at install
            time.  That node would discover any other Redfish services
            on the LAN, and begin aggregating them.  The master node
            would keep an in-memory cache of the other services, and
            reload resources on demand.  If a node goes down, then the
            error is propagated using HTTP return codes.  If the master
            node goes down, then the entire aggregate will go down.  In
            theory a client could talk to individual nodes if it needed
            to.</div>
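          <div><br>
          </div>
          <div>A small Python sketch of the caching behaviour described above, under stated assumptions: the fetch callable, the time-to-live, and the choice of 502 for an unreachable node are all hypothetical and only meant to illustrate reload on demand and error propagation through HTTP status codes.</div>

```python
import time


class AggregatorCache:
    """In-memory cache kept by the master node for resources on other nodes."""

    def __init__(self, fetch, ttl_seconds=5.0, clock=time.monotonic):
        # fetch is a hypothetical callable: (node, path) -> (status, body)
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, node, path):
        key = (node, path)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return 200, cached[1]
        try:
            status, body = self._fetch(node, path)
        except OSError:
            # Node unreachable: drop any stale entry and report a gateway
            # error to the client (502 is an assumption, not a requirement).
            self._entries.pop(key, None)
            return 502, None
        if status == 200:
            self._entries[key] = (now, body)
        else:
            self._entries.pop(key, None)
        return status, body


if __name__ == "__main__":
    def fake_fetch(node, path):
        if node == "down":
            raise ConnectionError("unreachable")
        return 200, {"Id": node}

    cache = AggregatorCache(fake_fetch)
    print(cache.get("bmc1", "/redfish/v1/Chassis"))
    print(cache.get("down", "/redfish/v1/Chassis"))
```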
          <div><b><br>
            </b></div>
        </div>
      </div>
    </blockquote>
    <p>Case-1:<br>
      .......<br>
    </p>
    <p>Consider a hypothetical case where I have 4 compute nodes, each
      with its own BMC. That BMC is responsible for initiating
      power-on and other services for that node, getting the debug data
      out of that node, etc.</p>
    <p>We would want an external Management Console (MC) to manage this
      rack. Instead of going to the 4 nodes separately, the MC can ask 1 BMC,
      which I am calling the "Point Of Contact" (POC) BMC / Primary BMC for that
      rack. It is the job of that BMC to do whatever is needed to return
      the result.</p>
    <p>Similarly, when the POC goes down, we would need another POC.</p>
    <p>I believe Redfish discovery can be used to discover each BMC.
      But how does the heartbeat work between the discovered BMCs?<br>
      Also, when the POC goes down, how can we sense that and make some
      other BMC the POC using the Redfish framework?</p>
    <p><br>
      Case-2:<br>
      .......</p>
    <p>I have a control node that houses 2 BMCs. One can be Primary
      and the other can be Slave. Each BMC has a complete view of the
      whole system. <br>
    </p>
    <p>I am assuming we could still discover the other BMC using
      Redfish. But again, how do we exchange heartbeats and do failover
      operations?</p>
    <p>Thanks,</p>
    <p>!! Vishwa !!<br>
    </p>
    <blockquote type="cite">
      <div dir="ltr">
        <div>
          <div><b> Authentication and Authorization</b></div>
        </div>
        <div>This is an area where I think Redfish is a little hands
          off.  In an ideal world ACLs could be set up without
          proliferating usernames and passwords across nodes.  As an aside,
          we've been thinking about how to use Redfish without any
          usernames or passwords.  By using a combination of
          certificates and authorization tokens it should be possible to
          extend a security zone to a small cluster of BMCs.</div>
        <div><br>
        </div>
        <div>Regards,</div>
        <div>Richard</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Wed, Dec 11, 2019 at 11:33
          PM Neeraj Ladkani <<a href="mailto:neladk@microsoft.com" target="_blank">neladk@microsoft.com</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div lang="EN-US">
            <div>
              <p class="MsoNormal"><span style="color:windowtext">Sure,
                  how do we want to enable BMC-BMC communication?
                  Standard Redfish/IPMI?
                </span></p>
              <p class="MsoNormal"><span style="color:windowtext"> </span></p>
              <p class="MsoNormal"><span style="color:windowtext">Neeraj</span></p>
              <p class="MsoNormal"><span style="color:windowtext"> </span></p>
              <p class="MsoNormal"><span style="color:windowtext"> </span></p>
              <div>
                <div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in">
                  <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> vishwa <<a href="mailto:vishwa@linux.vnet.ibm.com" target="_blank">vishwa@linux.vnet.ibm.com</a>>
                      <br>
                      <b>Sent:</b> Wednesday, December 11, 2019 10:59 PM<br>
                      <b>To:</b> Neeraj Ladkani <<a href="mailto:neladk@microsoft.com" target="_blank">neladk@microsoft.com</a>><br>
                      <b>Cc:</b> <a href="mailto:openbmc@lists.ozlabs.org" target="_blank">openbmc@lists.ozlabs.org</a>;
                      <a href="mailto:sgundura@in.ibm.com" target="_blank">sgundura@in.ibm.com</a>;
                      <a href="mailto:kusripat@in.ibm.com" target="_blank">kusripat@in.ibm.com</a>;
                      <a href="mailto:shahjsha@in.ibm.com" target="_blank">shahjsha@in.ibm.com</a>;
                      <a href="mailto:vikantan@in.ibm.com" target="_blank">vikantan@in.ibm.com</a>;
                      Richard Hanley <<a href="mailto:rhanley@google.com" target="_blank">rhanley@google.com</a>><br>
                      <b>Subject:</b> Re: [EXTERNAL] Re: Managing
                      heterogeneous systems</span></p>
                </div>
              </div>
              <p class="MsoNormal"> </p>
              <div>
                <p class="MsoNormal">On 12/10/19 3:20 PM, Neeraj Ladkani
                  wrote:</p>
              </div>
              <blockquote style="margin-top:5pt;margin-bottom:5pt">
                <p class="MsoNormal"><span style="color:rgb(0,32,96)">Great
                    discussion. </span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)">The
                    problem is not the physical interface, as they can
                    communicate over LAN. The problem is entity binding,
                    as one compute node can be connected to 1 or more
                    storage nodes. How can we have one view of the system
                    from an operational perspective? Power on/off, SEL
                    logs, telemetry? </span></p>
              </blockquote>
              <div>
                <p class="MsoNormal"><span style="color:windowtext"> </span></p>
              </div>
              <div>
                <p class="MsoNormal"><span style="color:windowtext"><br>
                    Correct. This is where I mentioned the "Primary
                    BMC acting as Point Of Contact" for external
                    requests.<br>
                    Depending on how we want to service the request, we
                    could orchestrate it via the PoC BMC or tell
                    external requesters where they can get the data
                    so that they connect to those BMCs directly.</span></p>
              </div>
              <div>
                <p class="MsoNormal" style="margin-bottom:12pt"><span style="color:windowtext"><br>
                    !! Vishwa !!</span></p>
              </div>
              <blockquote style="margin-top:5pt;margin-bottom:5pt">
                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)">Some
                    of the problems:</span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>
                <ol style="margin-top:0in" start="1" type="1">
                  <li style="color:rgb(0,32,96);margin-left:0in">
                    Power operations: power and reset operations need to be
                    coordinated across all nodes in a system </li>
                  <li style="color:rgb(0,32,96);margin-left:0in">
                    Telemetry: the OS runs only on the head node, so a
                    request to read telemetry should return
                    telemetry (SEL logs, sensor values) from all the
                    nodes.
                  </li>
                  <li style="color:rgb(0,32,96);margin-left:0in">
                    Firmware update</li>
                  <li style="color:rgb(0,32,96);margin-left:0in">
                    RAS: memory errors are logged by UEFI SMM on the head
                    node, but the corresponding DIMM temperature and inlet
                    temperature are logged on a secondary node, and the two
                    are not mapped to each other.  </li>
                </ol>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)">I
                    have been exploring a couple of routes:
                  </span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>
                <ol style="margin-top:0in" start="1" type="1">
                  <li style="color:rgb(0,32,96);margin-left:0in">
                    LUN discovery and routing: this is similar to IPMI,
                    but I am working on an architecture to extend it to
                    support multiple LUNs and route them from the head node
                    (we would need LUN routing over LAN).
                  </li>
                  <li style="color:rgb(0,32,96);margin-left:0in">
                    Redfish hierarchy for systems </li>
                </ol>
                <pre><span style="color:black">    "Systems": {</span></pre>
                <pre><span style="color:black">        "@odata.id": "/redfish/v1/Systems"</span></pre>
                <pre><span style="color:black">    },</span></pre>
                <pre><span style="color:black">    "Chassis": {</span></pre>
                <pre><span style="color:black">        "@odata.id": "/redfish/v1/Chassis"</span></pre>
                <pre><span style="color:black">    },</span></pre>
                <pre><span style="color:black">    "Managers": {</span></pre>
                <pre><span style="color:black">        "@odata.id": "/redfish/v1/Managers"</span></pre>
                <pre><span style="color:black">    },</span></pre>
                <pre><span style="color:black">    "AccountService": {</span></pre>
                <pre><span style="color:black">        "@odata.id": "/redfish/v1/AccountService"</span></pre>
                <pre><span style="color:black">    },</span></pre>
                <pre><span style="color:black">    "SessionService": {</span></pre>
                <pre><span style="color:black">        "@odata.id": "/redfish/v1/SessionService"</span></pre>
                <pre><span style="color:black">    },</span></pre>
                <pre><span style="color:black">    "Links": {</span></pre>
                <pre><span style="color:black">        "Sessions": {</span></pre>
                <pre><span style="color:black">            "@odata.id": "/redfish/v1/SessionService/Sessions"</span></pre>
                <pre><span style="color:black">        }</span></pre>
                <pre><span style="color:black">    }</span></pre>
                <pre style="margin-left:0.5in"><span>3.<span style="font:7pt "Times New Roman"">  </span></span><span style="font-family:Calibri,sans-serif;color:rgb(0,32,96)">Custom Messaging over LAN ( PubSub)</span></pre>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)">I
                    am also working on a whitepaper in the same area :).
                    Happy to work with you
                    guys if you have any ideas on how we can standardize
                    this.
                  </span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)"> </span></p>
                <p class="MsoNormal"><span style="color:rgb(0,32,96)">Neeraj</span></p>
                <p class="MsoNormal"><span style="color:windowtext"> </span></p>
                <div>
                  <div style="border-right:none;border-bottom:none;border-left:none;border-top:1pt solid rgb(225,225,225);padding:3pt 0in 0in">
                    <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> vishwa
                        <a href="mailto:vishwa@linux.vnet.ibm.com" target="_blank"><vishwa@linux.vnet.ibm.com></a>
                        <br>
                        <b>Sent:</b> Tuesday, December 10, 2019 1:00 AM<br>
                        <b>To:</b> Richard Hanley <a href="mailto:rhanley@google.com" target="_blank"><rhanley@google.com></a>;
                        Neeraj Ladkani
                        <a href="mailto:neladk@microsoft.com" target="_blank"><neladk@microsoft.com></a><br>
                        <b>Cc:</b> <a href="mailto:openbmc@lists.ozlabs.org" target="_blank">openbmc@lists.ozlabs.org</a>;
                        <a href="mailto:sgundura@in.ibm.com" target="_blank">sgundura@in.ibm.com</a>;
                        <a href="mailto:kusripat@in.ibm.com" target="_blank">
                          kusripat@in.ibm.com</a>; <a href="mailto:shahjsha@in.ibm.com" target="_blank">shahjsha@in.ibm.com</a>;
                        <a href="mailto:vikantan@in.ibm.com" target="_blank">vikantan@in.ibm.com</a><br>
                        <b>Subject:</b> [EXTERNAL] Re: Managing
                        heterogeneous systems</span></p>
                  </div>
                </div>
                <p class="MsoNormal"> </p>
                <p>Hi Richard / Neeraj,</p>
                <p>Thanks for bringing this up. It's one of the
                  interesting topics for IBM.</p>
                <p>Some thoughts here.....</p>
                <p>When we have multiple BMCs as part of a single
                  system, there are 3 main parts to it.</p>
                <p>1/. Discovering the peer BMCs and role assignment<br>
                  2/. Monitoring the existence of peer BMCs - heartbeat
                  <br>
                  3/. In the event of losing the master, detect it
                  using #2 and then reassign the role</p>
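                <p>The three parts above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the node IDs, the timeout, and the "lowest ID wins" election rule are hypothetical, and the sketch ignores split-brain and quorum, which a real cluster stack has to handle.</p>

```python
class HeartbeatTracker:
    """Track peer heartbeats and pick a master among the peers still alive."""

    def __init__(self, timeout_seconds=3.0):
        self._timeout = timeout_seconds
        self._last_seen = {}

    def record(self, node_id, now):
        # Part 1 and 2: a discovered peer reports in with a heartbeat.
        self._last_seen[node_id] = now

    def alive(self, now):
        # A peer counts as alive if its last heartbeat is recent enough.
        return sorted(
            node for node, seen in self._last_seen.items()
            if now - seen <= self._timeout
        )

    def elect_master(self, now):
        # Part 3: reassign the role; here the lowest live node ID wins.
        alive = self.alive(now)
        return alive[0] if alive else None


if __name__ == "__main__":
    tracker = HeartbeatTracker(timeout_seconds=3.0)
    tracker.record("bmc-a", now=0.0)
    tracker.record("bmc-b", now=0.0)
    print(tracker.elect_master(now=1.0))
    tracker.record("bmc-b", now=4.0)  # bmc-a has stopped sending heartbeats
    print(tracker.elect_master(now=5.0))
```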
                <p>Depending on how we want to establish the roles, we
                  could have Single-Master, Many-Slave or Multi-Master,
                  Multi-Slave, etc.</p>
                <p>One of the teams here is trying to do a proof of concept (POC) for a
                  multi-BMC architecture and is still in the very early
                  stages.
                  <br>
                  The team is currently studying/evaluating the
                  available solutions: Corosync / Heartbeat /
                  Pacemaker.<br>
                  Corosync works nicely with clusters, but we need to
                  see if we can trim it down for a BMC.<br>
                  <br>
                  If we cannot use Corosync for some reason, then we need
                  to see if we can do the discovery using PLDM
                  (probably using the terminus IDs)<br>
                  and come up with custom rules for assigning
                  Master-Slave roles.</p>
                <p>If we choose to have Single-Master and Many-Slave, we
                  could have that Single-Master act as the
                  Point of Contact for external requests and then
                  orchestrate with the needed BMCs internally to get the
                  job done.</p>
                <p>I would be happy to hear about any alternatives
                  that suit a BMC kind of architecture.</p>
                <p>!! Vishwa !!</p>
                <div>
                  <p class="MsoNormal">On 12/10/19 4:32 AM, Richard
                    Hanley wrote:</p>
                </div>
                <blockquote style="margin-top:5pt;margin-bottom:5pt">
                  <div>
                    <p class="MsoNormal">Hi Neeraj, </p>
                    <div>
                      <p class="MsoNormal"> </p>
                    </div>
                    <div>
                      <p class="MsoNormal">This is an open question that
                        I've been looking into as well.  </p>
                    </div>
                    <div>
                      <p class="MsoNormal"> </p>
                    </div>
                    <div>
                      <p class="MsoNormal">For BMC to BMC communication
                        there are a few options.</p>
                    </div>
                    <div>
                      <ol start="1" type="1">
                        <li class="MsoNormal">
                          If you have network connectivity you can
                          communicate using Redfish.</li>
                        <li class="MsoNormal">
                          If you only have a PCIe connection, you'll
                          have to use either the in-band connection or
                          the sideband I2C*.  PLDM and MCTP are
                          protocols that are defined to handle this use
                          case, although I'm not sure if the OpenBMC
                          implementations have been used in production.</li>
                        <li class="MsoNormal">
                          There is always IPMI, which has its own
                          pros/cons.</li>
                      </ol>
                      <div>
                        <p class="MsoNormal">For taking several BMCs and
                          aggregating them into a single logical
                          interface that is exposed to the outside
                          world, there are a few things happening on
                          that front.  DMTF has been working on an
                          aggregation protocol for Redfish.  However,
                          it's my understanding that their proposal is
                          more directed at the client level, as opposed
                          to within a single "system".</p>
                      </div>
                    </div>
                    <div>
                      <p class="MsoNormal"> </p>
                    </div>
                    <div>
                      <p class="MsoNormal">I just recently joined the
                        community, but I've been thinking about how a
                        proxy layer could merge two Redfish services
                        together.  Since Redfish is fairly strongly
                        typed and has a well defined mechanism for OEM
                        extensions, this should be pretty generally
                        applicable.  I am planning on having a white
                        paper on the issue sometime after the holidays.</p>
                    </div>
                    <div>
                      <p class="MsoNormal"> </p>
                    </div>
                    <div>
                      <p class="MsoNormal">Another thing to note,
                        recently DMTF released a spec for running a
                        binary Redfish over PLDM called RDE.  That might
                        be a useful way of tying all these concepts
                        together.  </p>
                    </div>
                    <div>
                      <p class="MsoNormal"> </p>
                    </div>
                    <div>
                      <p class="MsoNormal">I'd be curious about your
                        thoughts and use cases here.  Would either PLDM
                        or Redfish fit your use case?</p>
                    </div>
                    <div>
                      <p class="MsoNormal"> </p>
                    </div>
                    <div>
                      <p class="MsoNormal">Regards,</p>
                    </div>
                    <div>
                      <p class="MsoNormal">Richard</p>
                    </div>
                    <div>
                      <p class="MsoNormal"> </p>
                    </div>
                    <div>
                      <p class="MsoNormal">*I've heard of some proposals
                        that run a network interface over PCIe.  I don't
                        know enough about PCIe to know if this is a good
                        idea.</p>
                    </div>
                  </div>
                  <p class="MsoNormal"> </p>
                  <div>
                    <div>
                      <p class="MsoNormal">On Mon, Dec 9, 2019 at 1:27
                        PM Neeraj Ladkani <<a href="mailto:neladk@microsoft.com" target="_blank">neladk@microsoft.com</a>>
                        wrote:</p>
                    </div>
                    <blockquote style="border-top:none;border-right:none;border-bottom:none;border-left:1pt solid rgb(204,204,204);padding:0in 0in 0in 6pt;margin:5pt 0in 5pt 4.8pt">
                      <div>
                        <div>
                          <p class="MsoNormal">Are there any standards
                            for managing heterogeneous systems? For
                            example, in a rack there may be a compute
                            node (with its own BMC) and a storage node
                            (with its own BMC) connected using a PCIe
                            switch.  How are these two BMCs represented as
                            one system?  Are there any standards for
                            BMC – BMC communication?
                          </p>
                          <p class="MsoNormal"> </p>
                          <p class="MsoNormal"> </p>
                          <p class="MsoNormal">Neeraj</p>
                          <p class="MsoNormal"> </p>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                </blockquote>
              </blockquote>
            </div>
          </div>
        </blockquote>
      </div>
    </blockquote>
  </div>

</blockquote></div>