Link phosphor-hostlogger and bmcweb

Nan Zhou nanzhou at google.com
Wed May 26 01:03:38 AEST 2021


>
> "we propose to remove the ring buffer and write to the log file as soon
> as some characters are received. This implicitly removes the needs of
> the ring buffer, and the persistence triggering (host reboot, sigterm,
> etc) in hostlogger"
> I would like to get a more detailed description of further changes in order
> to see the whole picture of the solution.

 Originally I didn't consider including changing the log file according to
the boot cycle; we will keep part of the dbus/signal watcher to make that
different reboot posts to a different log file.

We will work out some more detailed descriptions for future changes soon.

On Mon, May 24, 2021 at 11:41 PM Artem Senichev <artemsen at gmail.com> wrote:

> Sorry guys, maybe this is a misunderstanding on my part.
>
> I was confused with the following line in the proposal:
>
> "we propose to remove the ring buffer and write to the log file as soon
> as some characters are received. This implicitly removes the needs of
> the ring buffer, and the persistence triggering (host reboot, sigterm,
> etc) in hostlogger"
>
> I would like to get a more detailed description of further changes in order
> to see the whole picture of the solution.
>
> --
> Regards,
> Artem Senichev
> Software Engineer, YADRO.
>
>
> On Mon, May 24, 2021 at 09:27:39AM -0700, Ed Tanous wrote:
> > On Mon, May 24, 2021 at 12:52 AM Artem Senichev <artemsen at gmail.com>
> wrote:
> > >
> > > I'll try to convey the main idea that we tried to implement in this
> service.
> > >
> > > Hostlogger was originally designed to work with OpenPOWER systems,
> which
> > > generate a very detailed log at boot time.
> >
> > There are definitely other non OpenPOWER systems that have this same
> behavior.
> >
> > > It is important to save these logs and the host console output just
> before
> > > rebooting for further investigation of incidents when hardware errors
> occur.
> > > So, we have two log files for each server session (boot log + last OS
> messages).
> > > That's why we need a D-bus watcher.
> > > BMC flash has around 3MiB of free RW space, this force us to use
> compression
> > > and file rotation.
> > >
> > > All of these features are unnecessary for "streaming" real-time log
> recording.
> >
> > I disagree with you there.  Rotation and compression are still useful
> > in a "streaming" case.  Because of the ways the APIs are defined,
> > LogService in redfish provides both a "historical" version of past
> > logs.  It's useful to have those logs rotated and compressed.
> >
> > > You don't need DBus watchers, rotation can be done with native Linux
> utilities,
> > > you don't even need to split the input stream into lines.
> >
> > I'm not following why those now wouldn't be needed.  Redfish LogEntry
> > would separate per line, so we'd have to do the splitting somewhere.
> > There's already code to do that in hostlogger.  Wouldn't you still
> > want to separate log per boot, and have lines split between log files?
> >  I'm not following why those would go away just because there's a
> > desire to poll for logs and get up to date information.
> >
> > > Just redirect obmc-console.log: `tail -f /var/log/obmc-console.log >
> my.log`.
> >
> > This doesn't solve the problem presented.  First of all, log rotation
> > and compression are still needed.  Also, it's desirable to have dbus
> > watchers and separate the logs per boot, such that they can end up
> > separated in the Redfish API.  Also, in what you presented, my.log
> > would quickly and easily overflow the available space, as there's no
> > log rotation.
> >
> > >
> > > I understand your desire to add a new mode to the existing project
> instead of
> > > creating a new one. But there is very little in common between these
> modes.
> >
> > I'm not following how they're all that different, see above about
> > needing very similar features.  For the sake of argument, lets say we
> > went with a totally different implementation, would it be able to live
> > in the hostlogger repo to be able to reuse code where needed?  There's
> > a lot of code that I suspect will be identical.
> >
> > > Even reading the socket will have to be done separately, since it is
> buffered
> > > for line splitting in the current implementation.
> > > In the end, only bb-recipe and the `main` function will remain in the
> common.
> > >
> > > --
> > > Regards,
> > > Artem Senichev
> > > Software Engineer, YADRO.
> > >
> > > On Fri, May 21, 2021 at 10:51:45AM -0700, Nan Zhou wrote:
> > >
> > > > >
> > > > > >
> > > > > > > we propose to remove the ring buffer and write to the log file
> > > > > > > as soon as some characters are received.
> > > > > >
> > > > > > I am not sure it is a good idea.
> > > > > > The host can generate a lot of logs, but we have very limited
> free space.
> > > > > This is a fair concern, but wouldn't the rollover limits make this
> not
> > > > > an issue?  They seem like they could be easily configured.
> > > >
> > > > Right. Logrotate will be able to handle the rotation. Maximum size,
> # log
> > > > files, and compression can be easily configured.
> > > >
> > > > > In
> > > > > > addition, such heavy traffic will lead to a quick breakdown of
> the flash
> > > > > (most
> > > > > > available products are guaranteed to withstand around 100,000
> P/E cycles
> > > > > only).
> > > > > JFFS2 is wear leveled, and there are other BMC stacks that I know
> of
> > > > > that implement this without any ill effects to flash longevity,
> with
> > > > > that said, if Nan made the "last log on disk" feature configurable,
> > > > > would that alleviate your concerns?
> > > >
> > > > We also noticed that the obmc-server itself will buffer the log a
> bit. Will
> > > > it still be a problem if we don't write a character at once but a
> block of
> > > > them?
> > > > And as Ed said, we can also make this feature configurable. I would
> imagine
> > > > the log buffer will remain if the "last log on disk" feature is
> disabled.
> > > >
> > > >
> > > > > >
> > > > > > > This implicitly removes the needs
> > > > > > > of the ring buffer, and the persistence triggering (host
> reboot,
> > > > > sigterm,
> > > > > > > etc) in hostlogger. We believe this keeps the same
> functionality but
> > > > > saves
> > > > > > > hundreds of lines of codes in phosphor-hostlogger.
> > > > > Difference of opinion here, I don't think this removes the need for
> > > > > the host reboot event;  Having each reboot post to a different log
> > > > > needs to be maintained, and I have to imagine that there's some
> sort
> > > > > of sigterm handler still, although it becomes a lot smaller.
> > > >
> > > >
> > > > >
> > > > > > You are suggesting to delete the buffer, DBus watcher, log
> rotate. How
> > > > > are you
> > > > > > going to keep the same functionality if you remove everything
> related to
> > > > > it?
> > > > > +1.  In the initial thought I didn't think we were removing any
> > > > > functionality with this.  I had assumed the dbus watcher would
> remain,
> > > > > and we would still have the log rotation behavior.  In reading
> through
> > > > > Nans proposal I don't think these are getting removed;  Maybe I
> > > > > misunderstood?
> > > >
> > > >
> > > > Yes, if we want to keep different reboot posts to a different log
> file, we
> > > > can keep part of the dbus/signal watcher.
> > > >
> > > > On Fri, May 21, 2021 at 10:24 AM Ed Tanous <edtanous at google.com>
> wrote:
> > > >
> > > > > On Thu, May 20, 2021 at 11:10 PM Artem Senichev <
> artemsen at gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > On Thu, May 20, 2021 at 04:29:09PM -0700, Nan Zhou wrote:
> > > > > > > Hi all,
> > > > > > >
> > > > > > > In the previous thread (
> > > > > > >
> https://lists.ozlabs.org/pipermail/openbmc/2021-March/025234.html), we
> > > > > > > (engineers from Google and Quanta) discussed our attempt to
> share host
> > > > > > > serial logs via Redfish, which includes polling logs via
> LogService and
> > > > > > > streaming log bytes via EventService (e.g. SSE). We went with
> the
> > > > > event log
> > > > > > > architecture
> > > > > > > <
> > > > >
> https://github.com/openbmc/docs/blob/master/architecture/redfish-logging-in-bmcweb.md
> > > > > >
> > > > > > > and did the implementation.
> > > > > > >
> > > > > > > We still want to reuse the phosphor-hostlogger and do some
> > > > > modification. We
> > > > > > > believe it is better to try to reuse existing codes if
> possible and
> > > > > improve
> > > > > > > them rather than creating new things that have similar
> functionalities
> > > > > (in
> > > > > > > our case, it is a daemon that could collect logs and persist
> them).
> > > > > >
> > > > > > I agree, reusing code is a right choice, but only when it is
> really
> > > > > possible.
> > > > > > For now it looks like you want to remove most of the Hostlogger
> features
> > > > > to
> > > > > > transform it from buffer-like to stream-like service.
> > > > >
> > > > > I'm not quite following this statement.  Which features are getting
> > > > > removed?  From what I can see, he's suggesting making
> > > > > phosphor-hostlogger look more like a well-behaved linux logging
> > > > > daemon, but I don't think any features got omitted (or I'm missing
> > > > > something critical).
> > > > >
> > > > > >
> > > > > > > We want to do the following modification in
> phosphor-hostlogger (from
> > > > > the
> > > > > > > input and output point of view, just very few things will be
> changed)
> > > > > > >
> > > > > > > 1. One limitation of phosphor-hostlogger is that it will lose
> data in
> > > > > the
> > > > > > > memory (the ring buffer maintained by hostlogger) when BMC
> gets force
> > > > > > > restarted;
> > > > > >
> > > > > > When BMC goes to reboot it stops all services, at that moment
> hostlogger
> > > > > gets
> > > > > > a signal and flushes all gathered logs to a file.
> > > > >
> > > > > Only if the reboot is planned.  If the BMC loses power (which is
> > > > > "normal" for a bmc) there isn't time to persist to flash before the
> > > > > power goes down and the logs are most likely lost.
> > > > >
> > > > > >
> > > > > > > we propose to remove the ring buffer and write to the log file
> > > > > > > as soon as some characters are received.
> > > > > >
> > > > > > I am not sure it is a good idea.
> > > > > > The host can generate a lot of logs, but we have very limited
> free space.
> > > > >
> > > > > This is a fair concern, but wouldn't the rollover limits make this
> not
> > > > > an issue?  They seem like they could be easily configured.
> > > > >
> > > > > > In
> > > > > > addition, such heavy traffic will lead to a quick breakdown of
> the flash
> > > > > (most
> > > > > > available products are guaranteed to withstand around 100,000
> P/E cycles
> > > > > only).
> > > > >
> > > > > JFFS2 is wear leveled, and there are other BMC stacks that I know
> of
> > > > > that implement this without any ill effects to flash longevity,
> with
> > > > > that said, if Nan made the "last log on disk" feature configurable,
> > > > > would that alleviate your concerns?
> > > > >
> > > > > >
> > > > > > > This implicitly removes the needs
> > > > > > > of the ring buffer, and the persistence triggering (host
> reboot,
> > > > > sigterm,
> > > > > > > etc) in hostlogger. We believe this keeps the same
> functionality but
> > > > > saves
> > > > > > > hundreds of lines of codes in phosphor-hostlogger.
> > > > >
> > > > > Difference of opinion here, I don't think this removes the need for
> > > > > the host reboot event;  Having each reboot post to a different log
> > > > > needs to be maintained, and I have to imagine that there's some
> sort
> > > > > of sigterm handler still, although it becomes a lot smaller.
> > > > >
> > > > > >
> > > > > > You are suggesting to delete the buffer, DBus watcher, log
> rotate. How
> > > > > are you
> > > > > > going to keep the same functionality if you remove everything
> related to
> > > > > it?
> > > > >
> > > > > +1.  In the initial thought I didn't think we were removing any
> > > > > functionality with this.  I had assumed the dbus watcher would
> remain,
> > > > > and we would still have the log rotation behavior.  In reading
> through
> > > > > Nans proposal I don't think these are getting removed;  Maybe I
> > > > > misunderstood?
> > > > >
> > > > > >
> > > > > > > 2. We propose not to compress the latest log file. This saves
> us the
> > > > > > > overhead of doing decompression when BMCWeb just needs to
> retrieve the
> > > > > most
> > > > > > > recent logs. There are still going to be log rotations in the
> file
> > > > > level.
> > > > > > > Files other than the latest log file are still going to be
> compressed.
> > > > > We
> > > > > > > can modify existing codes to achieve this or use the linux
> logrotate
> > > > > > > directly.
> > > > > > >
> > > > > > > Furthermore, we will add host serial logs into BMCWeb, both
> LogService
> > > > > and
> > > > > > > EventService. In LogService, we will teach BMCWeb how to read
> the
> > > > > latest
> > > > > > > log file that is not compressed and the other compressed old
> logs, and
> > > > > how
> > > > > > > to assemble LogEntries out of raw serial logs. We will discuss
> > > > > EventService
> > > > > > > in future threads but the very initial idea is to setup
> inotify on log
> > > > > > > files and SSE to stream out new bytes to clients (like what the
> > > > > existing
> > > > > > > event logging is doing).
> > > > > > >
> > > > > > > As we said above, for phosphor-hostlogger, the input is still
> the
> > > > > > > obmc-server unix socket, and the output are still persisted
> log files.
> > > > > But
> > > > > > > the functionality will get improved (less data loss), code gets
> > > > > simplified
> > > > > > > (no ring buffer and persistence triggering), and it will
> become easier
> > > > > and
> > > > > > > more performant to get BMCWeb hooked up for log streaming via
> Redfish.
> > > > > > >
> > > > > > > Please let us know what you think. We appreciate any feedback.
> Thank
> > > > > you
> > > > > > > very much!
> > > > > > >
> > > > > > > Sincerely,
> > > > > > > Nan
> > > > > >
> > > > > > --
> > > > > > Regards,
> > > > > > Artem Senichev
> > > > > > Software Engineer, YADRO.
> > > > >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ozlabs.org/pipermail/openbmc/attachments/20210525/59a282bc/attachment-0001.htm>


More information about the openbmc mailing list