IPMI SEL refactoring comments
Tom Joseph
tomjose at linux.vnet.ibm.com
Thu Jan 24 03:11:48 AEDT 2019
Hello Jason,
On Thursday 29 November 2018 04:51 AM, Bills, Jason M wrote:
>
>
> On 11/26/2018 5:26 AM, Tom Joseph wrote:
>> Hello Jason,
>>
>> I reviewed the patches for the IPMI SEL refactoring and wanted to
>> summarize the concerns with the proposed design. There are some
>> current features that are broken.
>>
>> 1) The sensor numbers are arbitrary in the proposed implementation
>> and does not account for the presence/error corresponding to
>> inventory objects. On IBM systems the host firmware does not rely on
>> the SDR repository in BMC to figure out the sensor numbers, rather
>> the sensor numbers are programmed into the host firmware at the
>> build time. So the sensor numbers cannot be arbitrary.
>>
> I think this is the main issue: mapping sensor paths to sensor numbers
> in a flexible way. Once we can find a way to allow custom sensor
> number mapping, it should allow all types of sensors to be mapped as
> needed.
>
> One idea we've had would be to somehow add the sensor number into the
> sensor D-Bus object which would mitigate arbitrary numbering as needed.
>
>> > I know you are working on a proposal to have a customizable map of
>> sensor number to path. The support is only for threshold based
>> sensors( targeting only /xyz/openbmc_project/sensors namespace) , it
>> needs to extend support for discrete sensors. The sensor types are
>> limited to few types(temperature, voltage,current, fan_tach, power),
>> need to support other sensor types in the specification. It might not
>> be possible to deduce the sensor type from the D-Bus object path for
>> all sensors.
>>
>> 2) OpenPower systems like Witherspoon and Zaius relies on
>> eSEL(extended SEL format) to report debug data from the host
>> firmware. The eSEL data is added to the D-Bus native error log via
>> the Add SEL entry command. This is not handled with the refactoring.
>>
>> > This is an openpower requirement and needs to be handled in an
>> openpower IPMI OEM library/application and not in the phosphor
>> implementation. A D-Bus error log containing eSEL can be created on
>> completion of transfer of eSEL and not depend on Add SEL entry. The
>> callouts associated with the eSEL can be added to the D-Bus error log
>> by keeping a watch on the IPMI SEL entries. This might need changes
>> to phosphor-logging API to support adding callout after creation of
>> error log. This is a proposal and looks viable.
I made progress with moving the openpower specific code from
phosphor-host-ipmid to openpower IPMI OEM library.
>>
>> 3) The current implementation the Get SEL entry is translating the
>> D-Bus error log to IPMI SEL format. If there is a FRU callout
>> associated with the D-Bus error log(D-Bus object paths in the
>> inventory namespace) it is mapped to the sensor number associated
>> with FRU and reported through SEL.
>>
>> > The implementation of the Get SEL entry needs to adapt to include
>> all the objects paths(the patchsets target only
>> /xyz/openbmc_project/sensors namespace) and map the D-Bus object path
>> to the sensor numbers(based on the proposed customizable map).
>>
> This should also be solved by fixing the sensor number mapping.
>
>> 4) In the current implementation all the D-Bus error logs have 1x1
>> mapping in the IPMI SEL. In the proposed solution the users will have
>> to rely on both D-Bus error logs and IPMI SEL to get the complete
>> picture. The customers cannot rely only on IPMI SEL as an event
>> repository.
>>
>> > In the current proposal D-Bus error logs reported by BMC will not
>> be shown by SEL commands. I am okie with Jason's proposal to set up a
>> D-Bus match (either by ipmid or phosphor-sel-logger) that will watch
>> for new logging D-Bus objects to be created and add a SEL record for
>> them on creation.
>>
> Another thought would be to abandon D-Bus-based logging and see if the
> same information could be placed in the journal.
>
>> https://gerrit.openbmc-project.xyz/#/q/status:open++topic:%22IPMI+SEL%22
>>
>> Regards,
>> Tom
>>
>
> More work is needed to come up with a flexible generic sensor
> numbering scheme that will work with the various sensor
> implementations. I have been assigned to a different task and will
> not be working on this for a while. So, as discussed in the call this
> week, as a stop-gap, I plan to move the journal-based SEL
> implementation into intel-ipmi-oem for now. We can revisit this as we
> make progress on the sensor numbering issue. This will also allow
> everyone to see the full working implementation which may help resolve
> some questions.
Did you get to make progress with the flexible generic sensor numbering
scheme? We will be glad to have your journal based SEL implementation
upstreamed.
>
> Thanks,
> -Jason
>
More information about the openbmc
mailing list