Dealing with a sensor which doesn't have valid reading until host is powered up

Alex Qiu xqiu at google.com
Tue Sep 1 07:32:44 AEST 2020


Hi James,

I think the BiosPost power state might not suffice, because the host needs to
load firmware onto the device to enable the sensors at a certain stage of
the OS boot, which is very close to boot completion.

However, we can tolerate the fan being noisy before boot completion, and I
believe the root cause of the issue is that HwmonTempSensor freezes once
control flow hits boost::asio::async_read_until (
https://github.com/openbmc/dbus-sensors/blob/master/src/HwmonTempSensor.cpp#L92).
Do you know if this function does something special with a file whose read
can fail with errno EAGAIN? If so, replacing the errno in the driver
with something other than EAGAIN also seems to be a viable fix.
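
To make the distinction concrete, here is a minimal sketch of treating EAGAIN
from a hwmon-style file as "not ready yet" instead of a hard failure. It uses
plain POSIX reads, not boost::asio, and it is not the dbus-sensors code. The
names ReadStatus, ReadResult, classifyErrno and readTempOnce are hypothetical,
and the mapping of EAGAIN to "not ready" is an assumption about this
particular driver.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical classification, not taken from dbus-sensors.
enum class ReadStatus { Ok, NotReady, Failed };

struct ReadResult {
    ReadStatus status;
    double celsius;  // meaningful only when status == ReadStatus::Ok
};

// Assumption: the driver reports EAGAIN while the device is still waiting
// for the host to load its firmware; any other errno is a real error.
constexpr ReadStatus classifyErrno(int err)
{
    return err == EAGAIN ? ReadStatus::NotReady : ReadStatus::Failed;
}

// Read one value from a hwmon-style file that holds millidegrees Celsius
// as decimal text (for example "42000\n").
ReadResult readTempOnce(const std::string& path)
{
    assert(!path.empty());

    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) {
        return {classifyErrno(errno), 0.0};
    }

    char buf[32] = {};
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    int err = errno;  // save errno before close() can overwrite it
    close(fd);

    if (n < 0) {
        return {classifyErrno(err), 0.0};
    }
    if (n == 0) {
        return {ReadStatus::Failed, 0.0};  // empty file: nothing to parse
    }

    char* end = nullptr;
    long milli = std::strtol(buf, &end, 10);
    if (end == buf) {
        return {ReadStatus::Failed, 0.0};  // no digits found
    }
    return {ReadStatus::Ok, milli / 1000.0};
}
```

A polling loop built on this could keep its timer running on NotReady and
count only Failed results towards disabling the sensor. Whether that matches
what the async_read_until handler in HwmonTempSensor should do is exactly the
open question above.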

Thanks!

- Alex Qiu





On Fri, Aug 28, 2020 at 10:54 AM James Feist <james.feist at linux.intel.com>
wrote:

> On 8/28/2020 9:43 AM, Alex Qiu wrote:
> > Hi James,
> >
> > Thx for the reply! So right now, one thing is that the sensor is not
> > dependent on the power state of the host solely, but also dependent on
> > the boot progress of the host.
>
> Would the BiosPost power state not suffice?
>
> > And the more serious issue is that
> > returning EAGAIN from the driver freezes the sensor, which is what I'm
> > debugging right now. Do we have special treatment on errno returned by
> > the driver? Thx.
>
> I ran into a similar issue with the CPUSensor and this was my fix:
>
> https://github.com/openbmc/dbus-sensors/commit/c22b842bfa8cfe798d83f99fa7aa9f142278c21d#diff-ccbe0562fe1d501b4c1c42d967a02ea0
>
> I haven't hit this issue with hwmon sensor though.
>
> >
> > - Alex Qiu
> >
> >
> > On Fri, Aug 28, 2020 at 9:38 AM James Feist <james.feist at linux.intel.com> wrote:
> >
> >     On 8/27/2020 2:49 PM, Alex Qiu wrote:
> >      > Hi James,
> >      >
> >      > After some debugging, I realized that the code I pointed out earlier
> >      > wasn't the root cause. The update is that HwmonTempSensor stops
> >      > updating after the hwmon driver returns EAGAIN as errno. I'll keep
> >      > debugging...
> >      >
> >      > - Alex Qiu
> >      >
> >      >
> >      > On Tue, Aug 25, 2020 at 5:49 PM Alex Qiu <xqiu at google.com> wrote:
> >      >
> >      >     Hi James and OpenBMC community,
> >      >
> >      >     We have a sensor for HwmonTempSensor which doesn't have a valid
> >      >     reading until the host is fully booted. Before it becomes alive
> >      >     and useful, it gets disabled in code
> >      >     (https://github.com/openbmc/dbus-sensors/blob/master/include/sensor.hpp#L266)
> >      >     because of errors thrown up by the hwmon driver. Ideally, the
> >      >     thermal control loop should kick the fan to fail safe mode until
> >      >     no more errors are observed.
> >      >
> >      >     Any suggestions on how we should handle this kind of sensor properly?
> >
> >     For what it's worth, we use the PowerState property, which has options
> >     of power on or BiosPost, to disable scanning when the state is invalid:
> >
> https://github.com/openbmc/dbus-sensors/blob/f27a55c775383a3fb1ac655f3eda785f6845f214/src/HwmonTempMain.cpp#L208
> >
> >
> >      >
> >      >     Thank you!
> >      >
> >      >     - Alex Qiu
> >      >
> >
>