Dealing with a sensor which doesn't have valid reading until host is powered up

Alex Qiu xqiu at google.com
Tue Sep 1 08:08:50 AEST 2020


Hi James,

I just came across this doc (
https://www.boost.org/doc/libs/1_74_0/doc/html/boost_asio/overview/posix/stream_descriptor.html).
It looks like it's a terrible idea for a hwmon driver to return EAGAIN to
dbus-sensors. Given that, I think the proper fix is to use a different errno
in our driver, and this caveat should probably be documented
somewhere.
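
To illustrate why EAGAIN is special, here is a minimal sketch using plain
POSIX calls (no Boost). On a non-blocking descriptor, EAGAIN conventionally
means "nothing to read right now, try again once the descriptor is readable"
rather than "the read failed". My understanding, which I have not verified
against the Boost.Asio source, is that an event-driven reader treats it as a
cue to wait, which would explain the sensor freezing. The names ReadOutcome,
classifyRead and readEmptyNonBlockingPipe are hypothetical and are not part
of dbus-sensors; the empty pipe only stands in for a descriptor that reports
EAGAIN, it is not a hwmon file.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of dbus-sensors): classify the result of a
// single read() attempt from its return value and the errno it left behind.
enum class ReadOutcome
{
    Data,       // read() returned at least one byte
    EndOfFile,  // read() returned 0
    WouldBlock, // EAGAIN/EWOULDBLOCK: "no data yet", not a real failure
    HardError   // any other errno: a genuine error the caller can report
};

ReadOutcome classifyRead(ssize_t rc, int err)
{
    if (rc > 0)
    {
        return ReadOutcome::Data;
    }
    if (rc == 0)
    {
        return ReadOutcome::EndOfFile;
    }
    if (err == EAGAIN || err == EWOULDBLOCK)
    {
        return ReadOutcome::WouldBlock;
    }
    return ReadOutcome::HardError;
}

// Stand-in for a descriptor that answers EAGAIN: an empty pipe whose read
// end has been switched to non-blocking mode.
ReadOutcome readEmptyNonBlockingPipe()
{
    int fds[2];
    if (pipe(fds) != 0)
    {
        return ReadOutcome::HardError;
    }

    // Make the read end non-blocking so read() returns instead of sleeping.
    int flags = fcntl(fds[0], F_GETFL);
    fcntl(fds[0], F_SETFL, flags | O_NONBLOCK);

    char buf[16];
    errno = 0;
    ssize_t rc = read(fds[0], buf, sizeof(buf));
    int err = errno; // save errno before close() can overwrite it

    close(fds[0]);
    close(fds[1]);
    return classifyRead(rc, err);
}
```

With this classification, a driver that reports "reading not available yet"
through some errno other than EAGAIN would land in HardError, which the
caller can count as a sensor error instead of waiting indefinitely.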

Hi Guenter,

Is it reasonable for hwmon drivers to return EAGAIN? Is it something that
has special meaning and should be avoided in hwmon drivers?

Thank you!

- Alex Qiu


On Mon, Aug 31, 2020 at 2:32 PM Alex Qiu <xqiu at google.com> wrote:

> Hi James,
>
> I think the BiosPost power state might not suffice, because the host needs to
> load firmware onto the device in order to enable the sensors at a certain
> stage in the OS boot, which is very close to boot completion.
>
> However, we can tolerate the fan being noisy before boot completion, and I
> believe the root cause of the issue is that HwmonTempSensor freezes once the
> control flow hits boost::asio::async_read_until (
> https://github.com/openbmc/dbus-sensors/blob/master/src/HwmonTempSensor.cpp#L92).
> Do you know if this function does something special with a file that
> can return errno EAGAIN? Based on that, replacing the errno in the driver
> with something other than EAGAIN also seems to be a viable fix.
>
> Thanks!
>
> - Alex Qiu
>
>
>
> On Fri, Aug 28, 2020 at 10:54 AM James Feist <james.feist at linux.intel.com>
> wrote:
>
>> On 8/28/2020 9:43 AM, Alex Qiu wrote:
>> > Hi James,
>> >
>> > Thx for the reply! So right now, one thing is that the sensor is not
>> > dependent on the power state of the host solely, but also dependent on
>> > the boot progress of the host.
>>
>> Would the BiosPost power state not suffice?
>>
>> > And the more serious issue is that
>> > returning EAGAIN from the driver freezes the sensor, which is what I'm
>> > debugging right now. Do we have special treatment on errno returned by
>> > the driver? Thx.
>>
>> I ran into a similar issue with the CPUSensor and this was my fix:
>>
>> https://github.com/openbmc/dbus-sensors/commit/c22b842bfa8cfe798d83f99fa7aa9f142278c21d#diff-ccbe0562fe1d501b4c1c42d967a02ea0
>>
>> I haven't hit this issue with hwmon sensor though.
>>
>> >
>> > - Alex Qiu
>> >
>> >
>> > On Fri, Aug 28, 2020 at 9:38 AM James Feist <james.feist at linux.intel.com> wrote:
>> >
>> >     On 8/27/2020 2:49 PM, Alex Qiu wrote:
>> >      > Hi James,
>> >      >
>> >      > After some debugging, I realized that the code I pointed out
>> >      > earlier wasn't the root cause. The update is that HwmonTempSensor
>> >      > stops updating after the hwmon driver returns EAGAIN as errno.
>> >      > I'll keep debugging...
>> >      >
>> >      > - Alex Qiu
>> >      >
>> >      >
>> >      > On Tue, Aug 25, 2020 at 5:49 PM Alex Qiu <xqiu at google.com> wrote:
>> >      >
>> >      >     Hi James and OpenBMC community,
>> >      >
>> >      >     We have a sensor for HwmonTempSensor which doesn't have a
>> >      >     valid reading until the host is fully booted. Before it
>> >      >     becomes alive and useful, it gets disabled in code
>> >      >     (https://github.com/openbmc/dbus-sensors/blob/master/include/sensor.hpp#L266)
>> >      >     because of errors thrown up by the hwmon driver. Ideally, the
>> >      >     thermal control loop should kick the fan to fail safe mode
>> >      >     until no more errors are observed.
>> >      >
>> >      >     Any suggestions on how we should handle this kind of sensor
>> >      >     properly?
>> >
>> >     For what it's worth, we use the PowerState property, which has
>> >     options of power on or BiosPost, to disable scanning when the state
>> >     is invalid:
>> >
>> >     https://github.com/openbmc/dbus-sensors/blob/f27a55c775383a3fb1ac655f3eda785f6845f214/src/HwmonTempMain.cpp#L208
>> >
>> >
>> >      >
>> >      >     Thank you!
>> >      >
>> >      >     - Alex Qiu
>> >      >
>> >
>>
>