Platform telemetry and health monitoring

Neeraj Ladkani neladk at microsoft.com
Wed Jun 12 16:28:26 AEST 2019


Thanks, Kun, for summarizing the notes.

For detailed notes: https://github.com/openbmc/openbmc/wiki/Platform-telemetry-and-health-monitoring-Work-Group

Neeraj

From: openbmc <openbmc-bounces+neladk=microsoft.com at lists.ozlabs.org> On Behalf Of Kun Yi
Sent: Tuesday, June 11, 2019 11:24 AM
To: Alexander Amelkin <a.amelkin at yadro.com>
Cc: OpenBMC Maillist <openbmc at lists.ozlabs.org>
Subject: Re: Platform telemetry and health monitoring

Neeraj mentioned he will send out the meeting minutes. He will also look into setting up a wiki page holding the contents as well as minutes.

A few quick notes off the top of my head from the kick-off meeting:
- did a round table; all the orgs have similar requirements
- need to look into how the existing infra fits our needs and where it falls short
- will have workstreams for:
    - what to collect
    - how to collect
    - how to store
    - how to export
- collectd sounds interesting and promising for collecting metrics
- IPMI SELs have limitations as an event reporting mechanism; we possibly need a new event or error log reporting mechanism to aggregate fault logs from different components
- will need to look into Redfish and expand the specs as necessary to fit our needs

On Tue, Jun 11, 2019 at 2:02 AM Alexander Amelkin <a.amelkin at yadro.com> wrote:
I second the idea of reusing collectd. It's pretty standard and popular.

With best regards,
Alexander Amelkin,
Leading BMC Software Engineer, YADRO
https://yadro.com

05.06.2019 15:49, Brad Bishop wrote:
> On Tue, Jun 04, 2019 at 12:35:05PM -0700, Kun Yi wrote:
>> FYI: Srinivas, Neeraj, and I are finalizing a time slot for the kick off
>> meeting. We are thinking about a bi-weekly discussion.
>>
>> Also, I'm drafting a version of BMC metrics collection daemon. The first
>> draft is up on https://gerrit.openbmc-project.xyz/c/openbmc/docs/+/22257,
>> which we probably will go over during the meeting.
>
> I just wanted to point out the collectd project:  https://collectd.org/
>
> I'm not sure if it is suitable or not but it seems like a pretty close match to what you are trying to do and it would be a lot of code you don't have to write.
>
> Just something to consider.
>
> thx - brad


--
Regards,
Kun