Request to create repository google-ipmi-bmc-health

Vijay Khemka vijaykhemka at fb.com
Sat Oct 3 06:54:14 AEST 2020


Hi Sui,

On 10/1/20, 6:52 PM, "Sui Chen" <suichen at google.com> wrote:

    Hi Vijay,

    We can use whatever means gets health monitoring done.
    I have the following questions on how to merge the proposed IPMI
    Blob-based implementation, google-ipmi-bmc-health (referred to as
    "IPMI health blob") with phosphor-health-monitor. The intent of having
    a separate "google-ipmi-bmc-health" was to avoid these questions:

    1) The IPMI health blob is a library, not a daemon, so after the IPMI
    health blob is added, phosphor-health-monitor will have both a library
    and a daemon. The user needs to have a way to configure it. What is
    the recommended way of doing this configuration?

Yes, the same repo can generate a library as well as a daemon. Currently it
configures two metrics, CPU and memory. We can add another entry such as
IPMI blob, and the IPMI blob code will be built only if that entry is present.

    2) We are sending a protocol buffer through the IPMI interface to the
    BMC, and the protocol buffer may be only used for the IPMI path and
    not anywhere else. Would there be any concerns on the usage of a
    protocol buffer here?

If I understand correctly, the protocol buffer will be used by the daemon that
responds to the IPMI request and connects to this daemon via a
library call. In that case the use of the protocol buffer is completely restricted
to that path. If you are passing the protocol buffer to this daemon, then we have
to define some policy here.

    Other than these two things I think adding new metrics to
    phosphor-health-monitor should be manageable. I can start by trying to
    add the IPMI blob handler to phosphor-health-monitor; my first attempt
    might not look very elegant, but if we find answers to the two
    questions above, the merged result will look a lot better. Hopefully
    we can find a solution that works well for everyone.

I am looking forward to your patches.

    Thanks,
    Sui

    On Thu, Oct 1, 2020 at 12:06 PM Vijay Khemka <vijaykhemka at fb.com> wrote:
    >
    > Hi Sui,
    >
    > On 9/30/20, 8:30 AM, "openbmc on behalf of Sui Chen" <openbmc-bounces+vijaykhemka=fb.com at lists.ozlabs.org on behalf of suichen at google.com> wrote:
    >
    >     Hello OpenBMC community,
    >
    >     We are working on an IPMI blob-based implementation of BMC health
    >     monitoring. We currently have an internal working prototype version
    >     and would like to upload it to this newly proposed repository,
    >     openbmc/google-ipmi-bmc-health .
    >
    > In my opinion, we can enhance the existing health-monitor and add your features to it.
    >
    >     We are aware of existing BMC health monitoring designs such as:
    >     1. https://github.com/openbmc/phosphor-health-monitor and its
    >     documentation https://gerrit.openbmc-project.xyz/c/openbmc/docs/+/31957
    >     2. https://gerrit.openbmc-project.xyz/c/openbmc/docs/+/34766
    >
    >     Main differences between this implementation and existing ones are:
    >     - google-ipmi-bmc-health is implemented with the IPMI blob handler
    >     framework and exists as an IPMI blob handler, while
    >     phosphor-health-monitor runs as a daemon and exposes BMC health
    >     metrics on DBus in the same manner sensors are exposed.
    >
    > Is this going to be a library or a daemon? The same health-monitor daemon can
    > be enhanced to add these functionalities.
    >
    >     - This implementation does not check health metric values against
    >     thresholds or perform actions when thresholds are crossed.
    >
    > If you don't define a threshold in the configuration file, health-monitor will
    > not monitor the defined metrics against one either.
    >
    >     - This implementation currently reports uptime, memory usage, free
    >     disk space, CPU time consumed by processes, and file descriptor stats.
    >
    > The same can be added as extra metrics. The goal of this repo was to
    > start with basic metrics and add more as required.
    >
    >     - This implementation does not read a configuration file yet. It
    >     always reads the hard-coded set of health metrics listed above.
    >
    > We can enable or disable certain metrics through this configuration file.
    >
    >     - This implementation does not post-process sensor readings, such as
    >     computing the average CPU usage over a certain time window.
    >
    > A window size of 1 gives the latest data rather than averaged data.
    >
    >     As such, this implementation differs enough from existing ones such
    >     that we believe we have enough reasons to have a separate repository
    >     for it.
    >
    > I would strongly prefer to add all of the features to the existing repo.
    >
    >     Thanks!
    >
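
To illustrate two of the points made in the thread above (a window size of 1
yields the latest reading instead of an average, and a metric with no
threshold configured is simply not checked against one), here is a minimal
sketch in Python. It is not code from phosphor-health-monitor: the class
MetricMonitor and its fields window_size and threshold are hypothetical names
chosen for illustration, and the ">=" comparison is an assumption.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class MetricMonitor:
    """Hypothetical per-metric state (illustrative names only)."""
    name: str
    window_size: int = 1                # number of recent samples to average (>= 1)
    threshold: Optional[float] = None   # None: report values, never raise an alarm
    samples: Deque[float] = field(default_factory=deque)

    def add_sample(self, value: float) -> float:
        """Record a reading and return the average over the current window."""
        self.samples.append(value)
        # Drop the oldest readings once the window is full.
        while len(self.samples) > self.window_size:
            self.samples.popleft()
        return sum(self.samples) / len(self.samples)

    def is_critical(self, averaged: float) -> bool:
        """Compare against the threshold only if one was configured."""
        return self.threshold is not None and averaged >= self.threshold


# Window size 1: every call returns the most recent reading unchanged.
latest = MetricMonitor("cpu", window_size=1)
latest_values = [latest.add_sample(v) for v in (10.0, 50.0, 90.0)]

# Window size 3 with a threshold: readings are averaged before the check.
smoothed = MetricMonitor("cpu", window_size=3, threshold=80.0)
smoothed_values = [smoothed.add_sample(v) for v in (10.0, 50.0, 90.0)]
```

With these inputs the first monitor tracks each raw reading, while the second
smooths the 90.0 spike, so a single high sample does not cross the threshold
on its own.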