<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <p>Neeraj,</p>
    <p>Thanks for the inputs. It's nice to see that we are thinking
      along similar lines.</p>
    <p>AFAIK, we don't have any work-group that is driving <span
        style="color:windowtext">“Platform telemetry and health
        monitoring”. Also, do we want to treat telemetry and health
        monitoring as two different entities? In the past, there were
        thoughts about using websockets to channel some of the thermal
        parameters as telemetry data, but that was never implemented.</span></p>
    <p><span style="color:windowtext">I think we can discuss it here.</span></p>
    <p><span style="color:windowtext">!! Vishwa !!<br>
      </span></p>
    <div class="moz-cite-prefix">On 5/17/19 12:00 PM, Neeraj Ladkani
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:BL0PR2101MB093237A6F0A48C7212BD1FE0C80B0@BL0PR2101MB0932.namprd21.prod.outlook.com">
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <meta name="Generator" content="Microsoft Word 15 (filtered
        medium)">
      <style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
p.msonormal0, li.msonormal0, div.msonormal0
        {mso-style-name:msonormal;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;
        color:black;}
span.EmailStyle19
        {mso-style-type:personal;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
span.EmailStyle20
        {mso-style-type:personal-compose;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
ol
        {margin-bottom:0in;}
ul
        {margin-bottom:0in;}
--></style>
      <div class="WordSection1">
        <p class="MsoNormal"><span style="color:windowtext">At cloud
            scale, telemetry and health monitoring are critical. We
            should define a framework that allows platform owners to add
            their own telemetry hooks. The telemetry service should be
            designed to make this data accessible and to store it in a
            resilient way (like the black box that survives a plane
            crash).  <o:p></o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext">Is there any
            workgroup that drives this “Platform telemetry and
            health monitoring” feature?
            <o:p></o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext">Wishlist<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext">BMC
            telemetry : <o:p></o:p></span></p>
        <ol style="margin-top:0in" start="1" type="1">
          <li class="MsoListParagraph"
            style="color:windowtext;margin-left:0in;mso-list:l7 level1
            lfo5">
            Linux subsystem<o:p></o:p></li>
          <ol style="margin-top:0in" start="1" type="a">
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              Uptime<o:p></o:p></li>
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              CPU Load average<o:p></o:p></li>
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              Memory info<o:p></o:p></li>
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              Storage usage ( RW )  <o:p></o:p></li>
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              Dmesg<o:p></o:p></li>
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              Syslog <o:p></o:p></li>
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              FDs of critical processes <o:p></o:p></li>
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              Alignment traps <o:p></o:p></li>
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              WDT excursions <o:p></o:p></li>
          </ol>
          <li class="MsoListParagraph"
            style="color:windowtext;margin-left:0in;mso-list:l7 level1
            lfo5">
            IPMI subsystem<o:p></o:p></li>
          <ol style="margin-top:0in" start="1" type="a">
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              Request and response logging per interface, with timestamps
              (KCS, LAN, USB)<o:p></o:p></li>
            <li class="MsoListParagraph"
              style="color:windowtext;margin-left:0in;mso-list:l7 level2
              lfo5">
              Request and Response of IPMB<o:p></o:p></li>
          </ol>
        </ol>
        <p class="MsoListParagraph" style="margin-left:1.5in"><span
            style="color:windowtext">i. Request, Response, No. of Retries<o:p></o:p></span></p>
        <ol style="margin-top:0in" start="3" type="1">
          <li class="MsoListParagraph"
            style="color:windowtext;margin-left:0in;mso-list:l7 level1
            lfo5">
            Misc<o:p></o:p></li>
        </ol>
        <ol style="margin-top:0in" start="1" type="a">
          <li class="MsoListParagraph"
            style="color:windowtext;mso-list:l2 level1 lfo8">Critical
            Temperature Excursions
            <o:p></o:p></li>
        </ol>
        <p class="MsoListParagraph" style="margin-left:1.5in"><span
            style="color:windowtext">i. Minimum Reading of Sensor<o:p></o:p></span></p>
        <p class="MsoListParagraph" style="margin-left:1.5in"><span
            style="color:windowtext">ii. Max Reading of a sensor<o:p></o:p></span></p>
        <p class="MsoListParagraph" style="margin-left:1.5in"><span
            style="color:windowtext">iii. Count of state transition<o:p></o:p></span></p>
        <p class="MsoListParagraph" style="margin-left:1.5in"><span
            style="color:windowtext">iv. Retry Count<o:p></o:p></span></p>
        <ol style="margin-top:0in" start="2" type="a">
          <li class="MsoListParagraph" style="mso-list:l2 level1 lfo8">Count
            of assertions/deassertions of GPIO and ability to capture
            the state<o:p></o:p></li>
          <li class="MsoListParagraph" style="mso-list:l2 level1 lfo8">timestamp
            of last assertion/deassertion of GPIO<o:p></o:p></li>
        </ol>
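        <p class="MsoNormal"><span style="color:windowtext">As a rough
            illustration of the temperature-excursion items above
            (minimum reading, maximum reading, count of state
            transitions), here is a minimal Python sketch. The class name
            SensorStats, the critical_high threshold and the update
            method are hypothetical names chosen for this example; they
            are not an existing OpenBMC interface.</span></p>
<pre>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SensorStats:
    """Running statistics for one sensor (hypothetical helper)."""

    critical_high: float               # assumed critical threshold
    minimum: Optional[float] = None    # lowest reading seen so far
    maximum: Optional[float] = None    # highest reading seen so far
    transitions: int = 0               # normal/critical state changes
    in_excursion: bool = False         # currently at or above threshold

    def update(self, reading: float) -> None:
        """Fold one new reading into the statistics."""
        if self.minimum is None or self.minimum > reading:
            self.minimum = reading
        if self.maximum is None or reading > self.maximum:
            self.maximum = reading
        now_critical = reading >= self.critical_high
        if now_critical != self.in_excursion:
            self.transitions += 1
            self.in_excursion = now_critical
```
</pre>
        <p class="MsoNormal"><span style="color:windowtext">The same
            counting pattern could be applied to GPIO
            assertions/deassertions by treating each edge as a state
            transition and recording its timestamp.</span></p>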
        <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext">Thanks<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext">~Neeraj<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="color:windowtext"><o:p> </o:p></span></p>
        <div>
          <div style="border:none;border-top:solid #E1E1E1
            1.0pt;padding:3.0pt 0in 0in 0in">
            <p class="MsoNormal"><b><span style="color:windowtext">From:</span></b><span
                style="color:windowtext"> openbmc
                <a class="moz-txt-link-rfc2396E" href="mailto:openbmc-bounces+neladk=microsoft.com@lists.ozlabs.org">&lt;openbmc-bounces+neladk=microsoft.com@lists.ozlabs.org&gt;</a>
                <b>On Behalf Of </b>vishwa<br>
                <b>Sent:</b> Wednesday, May 8, 2019 1:11 AM<br>
                <b>To:</b> Kun Yi <a class="moz-txt-link-rfc2396E" href="mailto:kunyi@google.com">&lt;kunyi@google.com&gt;</a>; OpenBMC
                Maillist <a class="moz-txt-link-rfc2396E" href="mailto:openbmc@lists.ozlabs.org">&lt;openbmc@lists.ozlabs.org&gt;</a><br>
                <b>Subject:</b> Re: BMC health metrics (again!)<o:p></o:p></span></p>
          </div>
        </div>
        <p class="MsoNormal"><o:p> </o:p></p>
        <p>Hello Kun,<o:p></o:p></p>
        <p>Thanks for initiating it. I liked the /proc parsing. On the
          IPMI thing, is it targeted only at IPMI -or- at a generic
          BMC-Host communication link?<o:p></o:p></p>
        <p>Some of the things in my wish-list are:<o:p></o:p></p>
        <p>1/. Flash wear-and-tear detection, with the threshold as a
          config option<br>
          2/. Any SoC-specific health checks (if those are exposed)<br>
          3/. Mechanism to detect spurious interrupts on any HW link<br>
          4/. Some kind of check to see if there will be any I2C lock to
          a given end device<br>
          5/. Ability to detect errors on HW links<o:p></o:p></p>
        <p>On the watchdog(8) area, I was just thinking these:<o:p></o:p></p>
        <p>How about having some kind of BMC_health D-Bus properties
          -or- a compile-time feed, whose values can be fed into a
          configuration file, rather than watchdog always using the
          default /etc/watchdog.conf? If the properties are coming from
          D-Bus, then we could either append them to /etc/watchdog.conf
          -or- treat those values alone as the config file that is given
          to watchdog.<br>
          The systemd service files would be set up accordingly.<o:p></o:p></p>
        <p><br>
          We have seen instances where we get an error indicating that
          no resources are available. Those could be file descriptors,
          socket descriptors, etc. Could we plug a check for this into
          watchdog as part of a test binary? We could then hook in a
          repair binary to take the action.<o:p></o:p></p>
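        <p>A minimal Python sketch of such a resource check, assuming a
          Linux procfs where each entry under /proc/PID/fd corresponds to
          one open descriptor. The function names, the limit argument and
          the 0.9 fraction are illustrative assumptions, not part of
          watchdog or OpenBMC; reading another process's fd directory
          normally needs matching privileges.</p>
<pre>
```python
import os


def open_fd_count(pid: int, proc_root: str = "/proc") -> int:
    """Count the entries in proc_root/PID/fd, one per open descriptor."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def near_fd_limit(open_fds: int, limit: int, fraction: float = 0.9) -> bool:
    """Return True when usage has reached the given fraction of the limit.

    The limit is supplied by the caller (for example, taken from the
    process's configured RLIMIT_NOFILE); this sketch does not look it up.
    """
    return open_fds >= fraction * limit
```
</pre>
        <p>A test binary built around this could signal a failure when
          near_fd_limit() returns True for any critical process, leaving
          the corrective action to the repair binary.</p>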
        <p><br>
          Another thing that I was looking at hooking into watchdog is
          a test that checks file system usage as defined by a
          policy.<br>
          The policy could list the file system mounts and the
          threshold for each.<br>
          <br>
          For example, /tmp, /root, etc. We could again hook in a repair
          binary to do some cleanup if needed.<br>
          <br>
          If we see the list growing with these custom requirements,
          then it probably does not make sense to pollute watchdog(8).<br>
          Should we have these consumed into the app instead?<o:p></o:p></p>
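        <p>The file system usage policy described above could be
          sketched in Python roughly as follows. The POLICY table, its
          thresholds and the function names are hypothetical; the sketch
          assumes the check reports failure through a non-zero exit
          status, which is how a watchdog test binary would typically
          signal a problem.</p>
<pre>
```python
import shutil
from typing import Callable, Dict, List

# Hypothetical policy: mount point mapped to maximum usage in percent.
POLICY: Dict[str, float] = {"/tmp": 80.0, "/root": 90.0}


def percent_used(path: str) -> float:
    """Return how full the file system holding path is, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # pseudo file systems can report a zero size
    return 100.0 * usage.used / usage.total


def over_threshold(policy: Dict[str, float],
                   measure: Callable[[str], float] = percent_used) -> List[str]:
    """List the mounts whose usage exceeds the policy threshold."""
    return [mount for mount, limit in policy.items() if measure(mount) > limit]


def exit_status(policy: Dict[str, float],
                measure: Callable[[str], float] = percent_used) -> int:
    """Return 0 when every mount is within policy, 1 otherwise."""
    return 1 if over_threshold(policy, measure) else 0
```
</pre>
        <p>Passing the measurement function as an argument keeps the
          policy logic testable without touching real mounts.</p>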
        <p>!! Vishwa !!<o:p></o:p></p>
        <div>
          <p class="MsoNormal">On 4/9/19 9:55 PM, Kun Yi wrote:<o:p></o:p></p>
        </div>
        <blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
          <div>
            <div>
              <div>
                <div>
                  <p class="MsoNormal">Hello there,<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">This topic has been brought up
                    several times on the mailing list and offline, but
                    it seems that we as a community have not reached a
                    consensus on which things would be the most valuable
                    to monitor, and how to monitor them. While a
                    general-purpose monitoring infrastructure for
                    OpenBMC seems to be a hard problem, I have some simple
                    ideas that I hope can provide immediate and direct
                    benefits.<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">1. Monitoring host IPMI link
                    reliability (host side)<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">The essentials I want are "IPMI
                    commands sent" and "IPMI commands succeeded" counts
                    over time. More metrics, such as response time, would
                    be helpful as well. The issue to address here: when
                    some IPMI sensor readings are flaky, it would be
                    really helpful to be able to tell from IPMI command
                    stats whether it is a hardware issue or an IPMI
                    issue. Moreover, it would be a very useful
                    regression test metric for rolling out new BMC
                    software.<o:p></o:p></p>
                </div>
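                <div>
                  <p class="MsoNormal">A minimal Python sketch of the
                    "commands sent" and "commands succeeded" counters
                    described above, with mean response time added. The
                    class IpmiLinkStats and its methods are hypothetical
                    names for illustration; how each command result is
                    observed (driver statistics, library instrumentation)
                    is left open here.</p>
<pre>
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IpmiLinkStats:
    """Counters for one IPMI interface (hypothetical helper)."""

    sent: int = 0                    # commands issued
    succeeded: int = 0               # commands that completed successfully
    total_response_s: float = 0.0    # summed response time in seconds

    def record(self, ok: bool, response_s: float) -> None:
        """Account for one completed or failed command."""
        self.sent += 1
        if ok:
            self.succeeded += 1
        self.total_response_s += response_s

    def success_rate(self) -> Optional[float]:
        """Fraction of commands that succeeded, or None if none were sent."""
        return self.succeeded / self.sent if self.sent else None

    def mean_response_s(self) -> Optional[float]:
        """Average response time, or None if no commands were sent."""
        return self.total_response_s / self.sent if self.sent else None
```
</pre>
                  <p class="MsoNormal">Comparing success_rate() across
                    BMC software versions is one way such counters could
                    serve as a regression metric.</p>
                </div>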
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">Looking at the host IPMI side,
                    there are some metrics exposed
                    through /proc/ipmi/0/si_stats if the ipmi_si driver is
                    used, but I haven't dug into whether it contains
                    information mapping to the interrupts. Time to read
                    the source code, I guess.<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">Another idea would be to
                    instrument caller libraries like the interfaces in
                    ipmitool, though I feel that approach is harder due
                    to fragmentation of IPMI libraries.<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">2. Read and expose core BMC
                    performance metrics from procfs<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">This is straightforward: have a
                    smallish daemon (or bmc-state-manager) read, parse,
                    and process procfs and put values on D-Bus. Core
                    metrics I'm interested in getting this way:
                    load average, memory, disk used/available, net
                    stats... The values can then simply be exported as
                    IPMI sensors or Redfish resource properties.<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">A nice byproduct of this effort
                    would be a procfs parsing library. Since different
                    platforms would probably have different monitoring
                    requirements and the procfs output format has no
                    standard, I'm thinking the user would just provide a
                    configuration file containing a list of (procfs path,
                    property regex, D-Bus property name) entries, and
                    compile-time generated code would provide an object
                    for each property. <o:p></o:p></p>
                </div>
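                <div>
                  <p class="MsoNormal">To make the (procfs path, property
                    regex, D-Bus property name) idea concrete, here is a
                    minimal Python sketch. It interprets the table at
                    runtime instead of generating code at compile time,
                    and it stops short of publishing anything on D-Bus.
                    The METRICS entries, the property names and the
                    function names are assumptions made for this
                    example.</p>
<pre>
```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# (procfs path, property regex, D-Bus property name); illustrative entries.
METRICS: List[Tuple[str, str, str]] = [
    ("/proc/loadavg", r"^(\S+)", "LoadAverage1Min"),
    ("/proc/meminfo", r"^MemAvailable:\s+(\d+)", "MemAvailableKiB"),
    ("/proc/uptime", r"^(\S+)", "UptimeSeconds"),
]


def read_file(path: str) -> str:
    """Return the full contents of a procfs file."""
    with open(path, "r") as handle:
        return handle.read()


def extract(text: str, pattern: str) -> Optional[float]:
    """Return the first capture group as a float, or None if no match."""
    match = re.search(pattern, text, re.MULTILINE)
    return float(match.group(1)) if match else None


def collect(metrics: List[Tuple[str, str, str]],
            read: Callable[[str], str] = read_file) -> Dict[str, float]:
    """Map each property name to the value parsed from its procfs file."""
    values: Dict[str, float] = {}
    for path, pattern, name in metrics:
        try:
            value = extract(read(path), pattern)
        except OSError:
            continue  # file not present on this platform
        if value is not None:
            values[name] = value
    return values
```
</pre>
                  <p class="MsoNormal">A daemon could call
                    collect(METRICS) periodically and set one D-Bus
                    property per returned key.</p>
                </div>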
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">All of these are merely thoughts
                    and nothing concrete. With that said, it would be
                    really great if you could provide some feedback such
                    as "I want this, but I really need that feature", or
                    let me know if it's all implemented already :)<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <div>
                  <p class="MsoNormal">If this seems valuable, after
                    gathering more feedback on feature requirements, I'm
                    going to turn them into design docs and upload them
                    for review.<o:p></o:p></p>
                </div>
                <div>
                  <p class="MsoNormal"><o:p> </o:p></p>
                </div>
                <p class="MsoNormal">-- <o:p></o:p></p>
                <div>
                  <div>
                    <p class="MsoNormal">Regards, <o:p></o:p></p>
                    <div>
                      <p class="MsoNormal">Kun<o:p></o:p></p>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </div>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>