<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;
color:black;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0in;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:"Courier New";}
span.EmailStyle23
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:606087872;
mso-list-template-ids:1974789806;}
@list l1
{mso-list-id:729839584;
mso-list-type:hybrid;
mso-list-template-ids:-140484154 67698703 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l1:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l1:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l1:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l2
{mso-list-id:795609884;
mso-list-template-ids:-1364179524;}
@list l3
{mso-list-id:961033452;
mso-list-template-ids:-204698002;}
@list l4
{mso-list-id:1727483152;
mso-list-type:hybrid;
mso-list-template-ids:1314007658 67698703 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l4:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l4:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l4:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l4:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l4:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l4:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l4:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l4:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l4:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l5
{mso-list-id:2020548198;
mso-list-template-ids:-442833286;}
@list l5:level1
{mso-level-tab-stop:.5in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level2
{mso-level-tab-stop:1.0in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level3
{mso-level-tab-stop:1.5in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level4
{mso-level-tab-stop:2.0in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level5
{mso-level-tab-stop:2.5in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level6
{mso-level-tab-stop:3.0in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level7
{mso-level-tab-stop:3.5in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level8
{mso-level-tab-stop:4.0in;
mso-level-number-position:left;
text-indent:-.25in;}
@list l5:level9
{mso-level-tab-stop:4.5in;
mso-level-number-position:left;
text-indent:-.25in;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="color:windowtext">Sure. How do we want to enable BMC-to-BMC communication? Standard Redfish or IPMI?
<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext">Neeraj<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> vishwa &lt;vishwa@linux.vnet.ibm.com&gt;
<br>
<b>Sent:</b> Wednesday, December 11, 2019 10:59 PM<br>
<b>To:</b> Neeraj Ladkani &lt;neladk@microsoft.com&gt;<br>
<b>Cc:</b> openbmc@lists.ozlabs.org; sgundura@in.ibm.com; kusripat@in.ibm.com; shahjsha@in.ibm.com; vikantan@in.ibm.com; Richard Hanley &lt;rhanley@google.com&gt;<br>
<b>Subject:</b> Re: [EXTERNAL] Re: Managing heterogeneous systems<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">On 12/10/19 3:20 PM, Neeraj Ladkani wrote:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span style="color:#002060">Great discussion. </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060">The problem is not the physical interface, since the BMCs can communicate over LAN. The problem is entity binding, since one compute node can be connected to one or more storage nodes. How can we present a single view of the system from
an operational perspective: power on/off, SEL logs, telemetry? </span><o:p></o:p></p>
</blockquote>
<div>
<p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:windowtext"><br>
Correct. This is where I mentioned the "Primary BMC acting as Point of Contact" for external requests.<br>
Depending on how we want to service a request, we could either orchestrate it via the PoC BMC, or tell external requesters where they can get the data so that they connect to those BMCs directly.<o:p></o:p></span></p>
</div>
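<p class="MsoNormal"><span style="color:windowtext">A minimal sketch of the two servicing options above, not a definitive design. The ownership table, the host name, the resource path and the <code>fetch</code> callable are all hypothetical, and the real transport (Redfish, IPMI, PLDM) is deliberately left out.</span></p>
<pre>
```python
def service_request(resource, owners, fetch, mode="orchestrate"):
    """Service an external request at the Point-of-Contact (PoC) BMC.

    owners: hypothetical table mapping a resource path to the BMC that holds it.
    fetch:  callable(host, path) that retrieves the data from a peer BMC.
    """
    owner = owners[resource]
    if mode == "orchestrate":
        # The PoC BMC talks to the owning BMC and answers on its behalf.
        return {"status": 200, "body": fetch(owner, resource)}
    # Otherwise tell the requester where the data lives so it connects directly.
    return {"status": 307, "location": f"https://{owner}{resource}"}


# Hypothetical host name and resource path, with a stand-in for the real transport.
owners = {"/redfish/v1/Chassis/storage0": "bmc-storage.example"}


def fake_fetch(host, path):
    return {"served_by": host, "path": path}


proxied = service_request("/redfish/v1/Chassis/storage0", owners, fake_fetch)
redirected = service_request(
    "/redfish/v1/Chassis/storage0", owners, fake_fetch, mode="redirect"
)
```
</pre>
<p class="MsoNormal"><span style="color:windowtext">Orchestrating keeps a single endpoint for clients at the cost of extra load on the PoC BMC. Redirecting keeps the PoC BMC light but requires clients to reach every BMC.</span></p>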
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span style="color:windowtext"><br>
!! Vishwa !!<o:p></o:p></span></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span style="color:#002060"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060">Some of the problems:</span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060"> </span><o:p></o:p></p>
<ol style="margin-top:0in" start="1" type="1">
<li class="MsoListParagraph" style="color:#002060;margin-left:0in;mso-list:l1 level1 lfo3">
Power operations: power on/off and resets need to be coordinated across all nodes in a system. <o:p>
</o:p></li><li class="MsoListParagraph" style="color:#002060;margin-left:0in;mso-list:l1 level1 lfo3">
Telemetry: the OS runs only on the head node, so when there is a request to read telemetry, the head node should gather telemetry (SEL logs, sensor values) from all the nodes.
<o:p></o:p></li><li class="MsoListParagraph" style="color:#002060;margin-left:0in;mso-list:l1 level1 lfo3">
Firmware update<o:p></o:p></li><li class="MsoListParagraph" style="color:#002060;margin-left:0in;mso-list:l1 level1 lfo3">
RAS: memory errors are logged by UEFI SMM on the head node, but the corresponding DIMM temperature and inlet temperature are logged on the secondary node, and the two are not mapped to each other. <o:p></o:p></li></ol>
<p class="MsoNormal"><span style="color:#002060"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060">I have been exploring a couple of routes:
</span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060"> </span><o:p></o:p></p>
<ol style="margin-top:0in" start="1" type="1">
<li class="MsoListParagraph" style="color:#002060;margin-left:0in;mso-list:l4 level1 lfo6">
LUN discovery and routing: this is similar to IPMI, but I am working on an architecture that extends it to support multiple LUNs and route them from the head node (we would need LUN routing over LAN).
<o:p></o:p></li><li class="MsoListParagraph" style="color:#002060;margin-left:0in;mso-list:l4 level1 lfo6">
Redfish hierarchy for systems <o:p></o:p></li></ol>
<pre><span style="color:black"> "Systems": {</span><o:p></o:p></pre>
<pre><span style="color:black"> "@odata.id": "/redfish/v1/Systems"</span><o:p></o:p></pre>
<pre><span style="color:black"> },</span><o:p></o:p></pre>
<pre><span style="color:black"> "Chassis": {</span><o:p></o:p></pre>
<pre><span style="color:black"> "@odata.id": "/redfish/v1/Chassis"</span><o:p></o:p></pre>
<pre><span style="color:black"> },</span><o:p></o:p></pre>
<pre><span style="color:black"> "Managers": {</span><o:p></o:p></pre>
<pre><span style="color:black"> "@odata.id": "/redfish/v1/Managers"</span><o:p></o:p></pre>
<pre><span style="color:black"> },</span><o:p></o:p></pre>
<pre><span style="color:black"> "AccountService": {</span><o:p></o:p></pre>
<pre><span style="color:black"> "@odata.id": "/redfish/v1/AccountService"</span><o:p></o:p></pre>
<pre><span style="color:black"> },</span><o:p></o:p></pre>
<pre><span style="color:black"> "SessionService": {</span><o:p></o:p></pre>
<pre><span style="color:black"> "@odata.id": "/redfish/v1/SessionService"</span><o:p></o:p></pre>
<pre><span style="color:black"> },</span><o:p></o:p></pre>
<pre><span style="color:black"> "Links": {</span><o:p></o:p></pre>
<pre><span style="color:black"> "Sessions": {</span><o:p></o:p></pre>
<pre><span style="color:black"> "@odata.id": "/redfish/v1/SessionService/Sessions"</span><o:p></o:p></pre>
<pre><span style="color:black"> }</span><o:p></o:p></pre>
<pre><span style="color:black"> }</span><o:p></o:p></pre>
<pre style="margin-left:.5in;text-indent:-.25in;mso-list:l4 level1 lfo6"><![if !supportLists]><span style="mso-list:Ignore">3.<span style="font:7.0pt 'Times New Roman'"> </span></span><![endif]><span style="font-family:'Calibri',sans-serif;color:#002060">Custom messaging over LAN (pub/sub)</span><o:p></o:p></pre>
<p class="MsoNormal"><span style="color:#002060"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060">I am also working on a whitepaper in the same area :). Happy to work with you guys if you have any ideas on how we can standardize this.
</span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#002060">Neeraj</span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:windowtext"> </span><o:p></o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span style="color:windowtext"> vishwa
<a href="mailto:vishwa@linux.vnet.ibm.com">&lt;vishwa@linux.vnet.ibm.com&gt;</a> <br>
<b>Sent:</b> Tuesday, December 10, 2019 1:00 AM<br>
<b>To:</b> Richard Hanley <a href="mailto:rhanley@google.com">&lt;rhanley@google.com&gt;</a>; Neeraj Ladkani
<a href="mailto:neladk@microsoft.com">&lt;neladk@microsoft.com&gt;</a><br>
<b>Cc:</b> <a href="mailto:openbmc@lists.ozlabs.org">openbmc@lists.ozlabs.org</a>;
<a href="mailto:sgundura@in.ibm.com">sgundura@in.ibm.com</a>; <a href="mailto:kusripat@in.ibm.com">
kusripat@in.ibm.com</a>; <a href="mailto:shahjsha@in.ibm.com">shahjsha@in.ibm.com</a>;
<a href="mailto:vikantan@in.ibm.com">vikantan@in.ibm.com</a><br>
<b>Subject:</b> [EXTERNAL] Re: Managing heterogeneous systems</span><o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"> <o:p></o:p></p>
<p>Hi Richard / Neeraj,<o:p></o:p></p>
<p>Thanks for bringing this up. It's one of the interesting topics for IBM.<o:p></o:p></p>
<p>Some thoughts here:<o:p></o:p></p>
<p>When we have multiple BMCs as part of a single system, there are three main parts to it.<o:p></o:p></p>
<p>1/. Discovering the peer BMCs and assigning roles<br>
2/. Monitoring the existence of peer BMCs (heartbeat)<br>
3/. In the event of losing the master, detecting that using #2 and then reassigning the role<o:p></o:p></p>
<p>Depending on how we want to establish the roles, we could have Single-Master/Many-Slave, Multi-Master/Multi-Slave, etc.<o:p></o:p></p>
<p>One of the teams here is working on a POC for a multi-BMC architecture, and it is still at a very early stage.
<br>
The team is currently studying and evaluating the available solutions: Corosync / Heartbeat / Pacemaker.<br>
Corosync works nicely with clusters, but we need to see if we can trim it down for a BMC.<br>
<br>
If we cannot use Corosync for some reason, then we need to see if we can do the discovery using PLDM (probably using the terminus IDs)<br>
and come up with custom rules for assigning Master-Slave roles.<o:p></o:p></p>
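<p>To make the three parts above concrete, here is a minimal sketch under stated assumptions, not a proposal for the actual design. The <code>Peer</code> and <code>Cluster</code> names, the 3-second timeout and the "lowest live ID becomes master" rule are all hypothetical. The IDs could be PLDM terminus IDs, but no PLDM or Corosync API is used here.</p>
<pre>
```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Peer:
    """One BMC as seen by its peers (hypothetical model)."""
    bmc_id: int              # e.g. a PLDM terminus ID (assumption)
    last_seen: float = 0.0   # timestamp of the most recent heartbeat


@dataclass
class Cluster:
    timeout: float = 3.0     # seconds of silence before a peer counts as lost
    peers: Dict[int, Peer] = field(default_factory=dict)
    master_id: Optional[int] = None

    def heartbeat(self, bmc_id, now):
        """Parts 1 and 2: discover a peer on first contact, refresh it afterwards."""
        self.peers.setdefault(bmc_id, Peer(bmc_id)).last_seen = now

    def alive(self, now):
        """IDs of the peers whose last heartbeat is within the timeout."""
        return sorted(
            p.bmc_id for p in self.peers.values()
            if self.timeout >= now - p.last_seen
        )

    def elect(self, now):
        """Part 3: keep the master while it is alive, else pick the lowest live ID."""
        live = self.alive(now)
        if self.master_id not in live:
            self.master_id = live[0] if live else None
        return self.master_id


cluster = Cluster(timeout=3.0)
cluster.heartbeat(2, now=0.0)
cluster.heartbeat(1, now=0.0)
first = cluster.elect(now=1.0)    # both BMCs alive, BMC 1 takes the master role
cluster.heartbeat(2, now=4.0)     # BMC 1 has gone silent
second = cluster.elect(now=5.0)   # the role is reassigned to BMC 2
```
</pre>
<p>The sketch keeps the current master for as long as it is alive, so a recovering BMC with a lower ID does not take the role back on its own. That is only one possible rule.</p>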
<p>If we choose Single-Master and Many-Slave, that single master could act as the Point of Contact for external requests and then orchestrate with the needed BMCs internally to get the job done.<o:p></o:p></p>
<p>I will be happy to know if there are alternatives that suit a BMC kind of architecture.<o:p></o:p></p>
<p>!! Vishwa !!<o:p></o:p></p>
<div>
<p class="MsoNormal">On 12/10/19 4:32 AM, Richard Hanley wrote:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal">Hi Neeraj, <o:p></o:p></p>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">This is an open question that I've been looking into as well. <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">For BMC to BMC communication there are a few options.<o:p></o:p></p>
</div>
<div>
<ol start="1" type="1">
<li class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l5 level1 lfo9">
If you have network connectivity you can communicate using Redfish.<o:p></o:p></li><li class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l5 level1 lfo9">
If you only have a PCIe connection, you'll have to use either the in-band connection or the sideband I2C*. PLDM and MCTP are protocols that are defined to handle this use case, although I'm not sure if the OpenBMC implementations have been used in production.<o:p></o:p></li><li class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l5 level1 lfo9">
There is always IPMI, which has its own pros/cons.<o:p></o:p></li></ol>
<div>
<p class="MsoNormal">For taking several BMCs and aggregating them into a single logical interface that is exposed to the outside world, there are a few things happening on that front. DMTF has been working on an aggregation protocol for Redfish. However,
it's my understanding that their proposal is more directed at the client level, as opposed to within a single "system".<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">I just recently joined the community, but I've been thinking about how a proxy layer could merge two Redfish services together. Since Redfish is fairly strongly typed and has a well-defined mechanism for OEM extensions, this should be
pretty generally applicable. I am planning on having a white paper on the issue sometime after the holidays.<o:p></o:p></p>
</div>
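<div>
<p class="MsoNormal">As a rough illustration of the proxy idea, here is a minimal sketch that merges the resource collections of two Redfish services into one view. This is an assumption about how such a layer might work, not the DMTF aggregation proposal. The <code>merge_collections</code> helper, the per-BMC prefixes and the sample payloads are hypothetical.</p>
<pre>
```python
def merge_collections(name, collections):
    """Merge Redfish-style resource collections into one aggregated view.

    collections maps a hypothetical prefix (one per BMC) to that BMC's
    collection payload. Member links are rewritten under the prefix so the
    proxy can route a later request back to the owning BMC.
    """
    members = []
    for prefix, payload in collections.items():
        for member in payload.get("Members", []):
            member_id = member["@odata.id"].rsplit("/", 1)[-1]
            members.append(
                {"@odata.id": f"/redfish/v1/{name}/{prefix}_{member_id}"}
            )
    return {
        "@odata.id": f"/redfish/v1/{name}",
        "Name": f"Aggregated {name} Collection",
        "Members": members,
        "Members@odata.count": len(members),
    }


# Sample payloads: both BMCs happen to use the same member ID.
compute = {"Members": [{"@odata.id": "/redfish/v1/Systems/system"}]}
storage = {"Members": [{"@odata.id": "/redfish/v1/Systems/system"}]}
merged = merge_collections("Systems", {"compute": compute, "storage": storage})
```
</pre>
<p class="MsoNormal">Prefixing the member IDs avoids collisions when both BMCs use the same ID, which is the simplest case a proxy has to handle.</p>
</div>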
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Another thing to note, recently DMTF released a spec for running a binary Redfish over PLDM called RDE. That might be a useful way of tying all these concepts together. <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">I'd be curious about your thoughts and use cases here. Would either PLDM or Redfish fit your use case?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Regards,<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Richard<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">*I've heard of some proposals that run a network interface over PCIe. I don't know enough about PCIe to know if this is a good idea.<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"> <o:p></o:p></p>
<div>
<div>
<p class="MsoNormal">On Mon, Dec 9, 2019 at 1:27 PM Neeraj Ladkani &lt;<a href="mailto:neladk@microsoft.com" target="_blank">neladk@microsoft.com</a>&gt; wrote:<o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Are there any standards for managing heterogeneous systems? For example, a rack may contain a compute node (with its own BMC) and a storage node (with its own BMC) connected using
a PCIe switch. How are these two BMCs represented as one system? Are there any standards for BMC-to-BMC communication?
<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Neeraj<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
</div>
</blockquote>
</div>
</blockquote>
</blockquote>
</div>
</body>
</html>