RFC for event logging mechanism

Tue Sep 5 21:24:01 AEST 2017

Hello,

I'm working this sprint on designing an event logging mechanism 
(https://github.com/openbmc/openbmc/issues/1856). I have a couple of 
proposals below, along with some questions. I'd like to hear thoughts on 
which proposal is better. Any other feedback is welcome.

Potential requirements
1) Applications should be able to log events of interest. Events could 
be used for purposes such as telemetry, analytics, and debugging. Examples 
include changes in the power/thermal domain (such as operating 
temperatures on a server), boot-related events, user account changes, etc.
2a) Users should be able to query events via REST.
2b) Users should also be able to query events of a certain category or type.
3) Users should also be able to "download" events in a format such as 
JSON (this comes for free today with the REST server running on OpenBMC).
4) It should be possible to specify event metadata, which may have use 
for a human as well as a program.
5) It should be possible to persist events up to a certain cap.
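
To make requirements 2b), 3), 4) and 5) concrete, here is a minimal 
in-memory sketch. Everything in it is hypothetical: the Event fields, the 
EventStore class and its method names are illustrations only, not part of 
phosphor-logging, and real persistence to flash is not shown.

```python
import json
from collections import deque
from dataclasses import dataclass, field, asdict


@dataclass
class Event:
    # Field names are illustrative; they are not taken from any existing interface.
    id: int
    category: str
    message: str
    metadata: dict = field(default_factory=dict)


class EventStore:
    """Keeps at most `cap` events; the oldest event is dropped first."""

    def __init__(self, cap):
        # A deque with maxlen discards the oldest entry once the cap is hit.
        self._events = deque(maxlen=cap)
        self._next_id = 1

    def log(self, category, message, **metadata):
        event = Event(self._next_id, category, message, metadata)
        self._next_id += 1
        self._events.append(event)
        return event

    def query(self, category=None):
        # With no category, return every stored event (requirement 2a);
        # otherwise filter by category (requirement 2b).
        return [e for e in self._events
                if category is None or e.category == category]

    def to_json(self, category=None):
        # Requirement 3: export events in a downloadable format.
        return json.dumps([asdict(e) for e in self.query(category)])


store = EventStore(cap=2)
store.log("thermal", "CPU temperature changed", sensor="cpu0", celsius=71)
store.log("boot", "Host boot started")
store.log("thermal", "Fan speed changed", fan="fan1", rpm=9000)
```

With a cap of 2, the first thermal event is evicted when the third event 
is logged, so only the boot event and the fan event remain queryable.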

Proposal 1 - Leverage existing OpenBMC phosphor-logging

Phosphor-logging works as a supplement to journald. At a high level, it 
makes it possible to log errors to the journal, as well as create d-bus 
objects representing the errors.

- Phosphor-logging uses the Entry interface [1] to describe an error. I 
have [2] as the proposed Event interface. It's mostly similar to [1], 
with two differences: it leaves out event severity and resolution, since 
I wasn't sure we really need them, and it adds an event Category, which 
would be handy for handling Requirement 2b).

- Phosphor-logging requires describing errors in yaml (error yaml and 
error metadata yaml), which are processed [3] by a script that generates 
an error log API, which clients can use. The API is part of a 
phosphor-logging client lib. The same yaml structure can be utilized for 
events, maybe with the yaml files themselves being named slightly 
differently to depict events and event metadata instead of errors. This 
means the client lib will have an event API, similar to the existing 
elog API [6]. Error yaml files are stored either in the 
phosphor-dbus-interfaces repo, or within an application's repo, based on 
whether the error corresponds to a d-bus interface failure or not. In 
case of events, I think the event yaml files can just be stored in the 
app that creates them.

- The event logging API, in addition to logging to journal, will call an 
internal phosphor-logging d-bus API, similar to [4], in order to create a 
d-bus object depicting the event. Based on the event Category, the d-bus 
object will be placed in the right namespace, such as 
/xyz/openbmc_project/logging/events/boot/ or 
/xyz/openbmc_project/logging/events/thermal/. The phosphor-logging 
process, hence, will own these d-bus objects, do the id management (per 
category), etc.
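
The per-category path placement and id management described above could 
look roughly like the sketch below. The root path comes from the examples 
in this mail; the EventPathAllocator class, its method name, and the 
choice to start ids at 1 are assumptions made for illustration.

```python
from collections import defaultdict

# Root taken from the example paths above.
EVENTS_ROOT = "/xyz/openbmc_project/logging/events"


class EventPathAllocator:
    """Hands out object paths with an independent id counter per category."""

    def __init__(self):
        # Last id handed out for each category; unseen categories start at 0.
        self._last_id = defaultdict(int)

    def allocate(self, category):
        name = category.lower()
        self._last_id[name] += 1
        return f"{EVENTS_ROOT}/{name}/{self._last_id[name]}"


allocator = EventPathAllocator()
first_boot = allocator.allocate("Boot")
first_thermal = allocator.allocate("Thermal")
second_boot = allocator.allocate("Boot")
```

Because each category has its own counter, a new thermal event does not 
consume an id in the boot namespace.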

Proposal 2 - Write d-bus interfaces to describe events

I see a couple of issues with Proposal 1:

a) It's cumbersome for a BMC app to figure out that a specific event was 
reported, or to express interest in a certain category of events. The 
d-bus path namespace can help to a certain extent here, but that approach 
relies on paths and properties rather than on interfaces being added.
b) Both the existing Entry interface [1] and the proposed Event 
interface [2] express metadata as strings, probably not the most elegant 
way for an interested program to deal with them.
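
To illustrate issue b), here is a small sketch of what a consumer has to 
do when metadata arrives as strings. It assumes a "KEY=VALUE" convention 
for each string, and the SENSOR and TEMPERATURE keys are made up for the 
example; neither is confirmed by the interfaces referenced here.

```python
def parse_metadata(entries):
    """Split a list of "KEY=VALUE" strings (assumed format) into a dict."""
    parsed = {}
    for entry in entries:
        key, sep, value = entry.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE entry: {entry!r}")
        parsed[key] = value
    return parsed


# Hypothetical metadata as it might be read from an event object.
raw = ["SENSOR=cpu0", "TEMPERATURE=71"]
metadata = parse_metadata(raw)

# Every value is still a string, so the consumer must know the intended
# type of each key and convert it by hand.
temperature = int(metadata["TEMPERATURE"])
```

The parsing itself is short, but the type knowledge lives in every 
consumer rather than in the interface definition.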

Given this, it feels more natural to express an event in its own d-bus 
interface, such as an Event.Boot or Event.Thermal interface. So, this 
proposal looks like this:

- Define an Event log interface [5]. Note that this is mostly like [2], 
although it has an additional method to create the event d-bus object.

- For specific event types, define their own d-bus interfaces. I don't 
have examples for these at the moment, but as I mentioned above, we 
could have interfaces for Event.Boot and Event.Thermal to start with. 
These interfaces could be placed in the phosphor-dbus-interfaces repo. A 
phosphor-logging application will have the code to implement these 
well-known event interfaces, and to basically create d-bus objects. This 
app will also implement the "Notify" method defined in [5].

- An application interested in reporting an event will make a call to 
the "Notify" API defined in [5], stating the event category and the 
event metadata. The phosphor-logging application that implements 
"Notify" will create d-bus objects based on the event Category and 
metadata, and place them in appropriate d-bus path namespaces, similar 
to Proposal 1. It can also log the event information to the journal, 
though I am not sure why this would be required, other than to keep the 
journal as the repository of all events.
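
The Proposal 2 flow could be modelled roughly as below. This is only a 
sketch under assumptions: BootEvent and ThermalEvent stand in for the 
not-yet-defined Event.Boot and Event.Thermal interfaces, their fields are 
invented, and the EventLogger class only imitates what a "Notify" 
implementation might do, without any real d-bus calls.

```python
from dataclasses import dataclass


@dataclass
class BootEvent:
    # Hypothetical stand-in for an Event.Boot interface.
    stage: str


@dataclass
class ThermalEvent:
    # Hypothetical stand-in for an Event.Thermal interface.
    sensor: str
    celsius: float


# Well-known event types the logger knows how to instantiate.
EVENT_TYPES = {"boot": BootEvent, "thermal": ThermalEvent}


class EventLogger:
    ROOT = "/xyz/openbmc_project/logging/events"

    def __init__(self):
        self.objects = {}   # object path -> typed event
        self._last_id = {}  # category -> last id handed out

    def notify(self, category, **metadata):
        # Pick the typed interface for this category; an unknown
        # category raises KeyError.
        event = EVENT_TYPES[category](**metadata)
        next_id = self._last_id.get(category, 0) + 1
        self._last_id[category] = next_id
        path = f"{self.ROOT}/{category}/{next_id}"
        self.objects[path] = event
        return path

    def of_type(self, event_type):
        # A consumer selects events by interface type, not by parsing paths.
        return {p: e for p, e in self.objects.items()
                if isinstance(e, event_type)}


logger = EventLogger()
thermal_path = logger.notify("thermal", sensor="cpu0", celsius=71.5)
boot_path = logger.notify("boot", stage="kernel")
```

Here the metadata keeps its types (celsius stays a float), and an 
interested application can filter on the event type itself, which is the 
point of issues a) and b) above.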

[1] 
https://github.com/openbmc/phosphor-dbus-interfaces/blob/master/xyz/openbmc_project/Logging/Entry.interface.yaml
[2] https://gerrit.openbmc-project.xyz/#/c/6405/1
[3] 
https://github.com/openbmc/phosphor-logging/blob/master/tools/elog-gen.py, 
error yaml example : 
https://github.com/openbmc/phosphor-dbus-interfaces/tree/master/xyz/openbmc_project/Dump
[4] 
https://github.com/openbmc/phosphor-logging/blob/master/log_manager.cpp#L27
[5] https://gerrit.openbmc-project.xyz/#/c/6406/1
[6] 
https://github.com/openbmc/phosphor-logging/blob/master/phosphor-logging/elog.hpp#L126

Regards,
Deepak