Adding support for custom SEL records

Deng Tyler tyler.sabdon at gmail.com
Fri Oct 21 01:39:49 AEDT 2022


Hi Lei:
    I've encountered a SEL cache sync issue. If a SEL entry is generated
while ipmid is still collecting all log entries from file during SEL cache
initialization, that entry never makes it into the SEL cache. Have you ever
encountered this issue?
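
To make the race concrete, here is a minimal sketch of one way such a window
could be closed: register the "entry added" handler before the bulk load,
buffer whatever arrives while loading, and replay it afterwards. This is
Python for brevity, not ipmid's actual C++ code, and every name in it
(SelCache, on_entry_added, load, read_all_entries) is hypothetical.

```python
import threading


class SelCache:
    """Toy SEL cache: subscribe first, bulk-load second, then replay."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}   # record id -> entry payload
        self._pending = []   # events that arrived during the initial load
        self._ready = False

    def on_entry_added(self, record_id, entry):
        """Handler for new entries; assumed to be registered before load()."""
        with self._lock:
            if self._ready:
                self._entries[record_id] = entry
            else:
                # Cache is still initializing: keep the event for later.
                self._pending.append((record_id, entry))

    def load(self, read_all_entries):
        """Bulk read; read_all_entries stands in for reading the log store."""
        snapshot = dict(read_all_entries())
        with self._lock:
            self._entries.update(snapshot)
            # Replay events that raced with the bulk read. Keying by record
            # id makes an entry seen by both paths harmless.
            for record_id, entry in self._pending:
                self._entries[record_id] = entry
            self._pending.clear()
            self._ready = True

    def record_ids(self):
        with self._lock:
            return sorted(self._entries)


cache = SelCache()


def slow_read():
    # Simulate an entry being generated while the bulk read is in progress.
    cache.on_entry_added(3, "generated mid-load")
    return [(1, "first"), (2, "second")]


cache.load(slow_read)
print(cache.record_ids())  # [1, 2, 3]
```

Without the pending buffer, entry 3 in this sketch would be dropped, which is
the symptom I am seeing.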

Tyler

Lei Yu <yulei.sh at bytedance.com> wrote on Thu, Oct 20, 2022 at 9:26 PM:

> On Thu, Oct 20, 2022 at 2:05 AM Bills, Jason M
> <jason.m.bills at linux.intel.com> wrote:
> >
> >
> >
> > On 10/19/2022 11:10 AM, Brad Bishop wrote:
> > > Thanks Jason
> > >
> > > On Wed, Oct 19, 2022 at 09:50:47AM -0600, Bills, Jason M wrote:
> > >
> > >> Intel had a requirement to support storing at least 4000 log entries.
> > >
>
> Bytedance has a requirement of 1000 log entries.
>
> > > Ok.  So is it fair to assume anyone using the DBus backend does not
> have
> > > this requirement?
> >
> > That is my assumption, yes.
> > >
> > >> At the time, we were able to get about 400 entries on D-Bus before
> > >> D-Bus performance became unusable.
> > >
> > > To anyone using the DBus backend - have you observed similar
> performance
> > > issues?
> > >
>
> We did hit the performance issue. Specifically, it is extremely slow
> during BMC boot, when log-manager restores the log entries and puts them
> on DBus.
> That's when I started the discussion about
> https://gerrit.openbmc.org/c/openbmc/phosphor-logging/+/52445 and
>
> https://lore.kernel.org/openbmc/CAGm54UHU9s0bTq-AR9tJunoX2Wa9tQ0PH_zWJ2QrYdR3SRqcvg@mail.gmail.com/
>
> Later we resolved the issue by:
> * Applying the patch
> https://gerrit.openbmc.org/c/openbmc/phosphor-objmgr/+/53904
> * Implementing the SEL cache in ipmid, which is already upstreamed
> * Improving the SEL cache by serialization (not upstreamed)
>
> Eventually we got fair performance on SEL handling (with 1000
> entries); it should handle 4000 as well.
>
> > > Jason, is there a testcase or scenario I can execute to highlight the
> > > issues you refer to concretely?  Maybe something like "create 4000
> sels,
> > > run ipmitool and see how long it takes?"
> >
> > To clarify, my understanding is the D-Bus performance issues were not
> > isolated to just IPMI.  All of D-Bus for every BMC service was impacted.
> >
> > If I remember correctly, Ed Tanous is who did the initial evaluation, so
> > he may have more detail.  But I think it was similar to what you
> > suggest: Create 4000 logs on D-Bus and check the performance.  This
> > could be done with ipmitool.
> > >
> > >> I'd also be curious about the reverse question.  Is there any benefit
> > >> to storing logs on D-Bus that makes it a better solution?
> > >
> > > Yes, this is exactly the question I've been trying to ask.  The answer
> > > seems only to be that the code is in meta-intel/intel-ipmi-oem - but
> > > that is easily fixed by moving the code to
> > > meta-phosphor/phosphor-host-ipmid.
> > >
> > >> At the risk of complicating things more (https://xkcd.com/927/),
> D-Bus
> > >> was the primary solution when Intel joined.  We created the rsyslog
> > >> approach because of the limitation imposed by D-Bus.  But I know there
> > >> are still those who don't like the rsyslog approach.  Is there a way
> > >> we can now get together and define a new logging solution that is
> > >> fully upstream and avoids the drawbacks of both existing solutions?
> > >
> > > I hope so, because doing that would make things a lot easier for our
> > > users adopting OpenBMC.
> >
> > My main requirements are to store many logs (at least 4000 was the
> > original number, but I can try to get an updated number if needed) and
> > have them persist across BMC reboots.
> >
> > We currently accomplish this using rsyslog to extract logs from the
> > journal and store them in a persistent text file.
> >
> > What is the best way to start a new design discussion?  Should we
> > continue discussing in this thread?  Start a design doc review?
> > Something else?
> > >
> > > Thanks,
> > > brad
>
> I would like to add several notes (possibly limitations) about
> rsyslog's SEL in intel-ipmi-oem; please correct me if I am wrong.
> * It handles the SELs from phosphor-sel-logger, mostly it only
> contains the threshold events.
> * It iterates over the sel files and converts the file content into SEL
> data on every request, which does not seem optimal.
> * The "add sel entry" does not really add a sel log, it adds an event
> entry to Redfish instead.
> * With the above behavior, it basically has two separate types of logs:
> SEL logs that are from rsyslog, and redfish event logs that are done
> by "add sel entry". Thus the implementation seems to only support SELs
> for sensor threshold events, but not for discrete sensors.
>
> At Bytedance we need a "full" SEL feature that supports both
> thresholds and discrete sensors.
> The whole solution is based on the DBus logging, but it involves
> different repos (ipmid, phosphor-logging, fault-monitor). Part of the
> implementation is upstreamed but some are internal for now.
> I would like to share the details when I have bandwidth :)
>
> --
> BRs,
> Lei YU
>