<div dir="ltr">Hi James,<div><br></div><div>I think BiosPist power state might not suffice, because the host needs to load firmware onto the device in order to enable the sensors at a certain stage in the OS boot, which is very close to boot completion.</div><div><br></div><div>However, we can tolerate the fan being noisy before boot completion, and I believe the root cause the issue is the HwmonTempSensor freezes once the control flow hitting boost::asio::async_read_until (<a href="https://github.com/openbmc/dbus-sensors/blob/master/src/HwmonTempSensor.cpp#L92">https://github.com/openbmc/dbus-sensors/blob/master/src/HwmonTempSensor.cpp#L92</a>). Do you know if this function has something special to do with a file that can have errno EAGAIN? Based on that, replacing the errno in the driver with sth other than EAGAIN also seems to be a viable fix.</div><div><br></div><div>Thanks!</div><div><br>- Alex Qiu<br><br></div></div><br clear="all"><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><br></div>- Alex Qiu</div></div></div><br><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Aug 28, 2020 at 10:54 AM James Feist <<a href="mailto:james.feist@linux.intel.com">james.feist@linux.intel.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 8/28/2020 9:43 AM, Alex Qiu wrote:<br>
> Hi James,<br>
> <br>
> Thx for the reply! So right now, one thing is that the sensor is not <br>
> dependent on the power state of the host solely, but also dependent on <br>
> the boot progress of the host.<br>
<br>
Would the BiosPost power state not suffice?<br>
<br>
> And the more serious issue is that <br>
> returning EAGAIN from the driver freezes the sensor, which is what I'm <br>
> debugging right now. Do we have special treatment on errno returned by <br>
> the driver? Thx.<br>
<br>
I ran into a similar issue with the CPUSensor and this was my fix:<br>
<a href="https://github.com/openbmc/dbus-sensors/commit/c22b842bfa8cfe798d83f99fa7aa9f142278c21d#diff-ccbe0562fe1d501b4c1c42d967a02ea0" rel="noreferrer" target="_blank">https://github.com/openbmc/dbus-sensors/commit/c22b842bfa8cfe798d83f99fa7aa9f142278c21d#diff-ccbe0562fe1d501b4c1c42d967a02ea0</a><br>
<br>
I haven't hit this issue with hwmon sensor though.<br>
<br>
> <br>
> - Alex Qiu<br>
> <br>
> <br>
> On Fri, Aug 28, 2020 at 9:38 AM James Feist <<a href="mailto:james.feist@linux.intel.com" target="_blank">james.feist@linux.intel.com</a> <br>
> <mailto:<a href="mailto:james.feist@linux.intel.com" target="_blank">james.feist@linux.intel.com</a>>> wrote:<br>
> <br>
> On 8/27/2020 2:49 PM, Alex Qiu wrote:<br>
> > Hi James,<br>
> ><br>
> > After some debugging, I realized that the code I pointed out earlier<br>
> > wasn't the root cause. Update is that, the HwmonTempSensor stops<br>
> > updating after the hwmon driver returns EAGAIN as errno. I'll keep<br>
> > debugging...<br>
> ><br>
> > - Alex Qiu<br>
> ><br>
> ><br>
> > On Tue, Aug 25, 2020 at 5:49 PM Alex Qiu <<a href="mailto:xqiu@google.com" target="_blank">xqiu@google.com</a><br>
> <mailto:<a href="mailto:xqiu@google.com" target="_blank">xqiu@google.com</a>><br>
> > <mailto:<a href="mailto:xqiu@google.com" target="_blank">xqiu@google.com</a> <mailto:<a href="mailto:xqiu@google.com" target="_blank">xqiu@google.com</a>>>> wrote:<br>
> ><br>
> > Hi James and OpenBMC community,<br>
> ><br>
> > We have a sensor for HwmonTempSensor which doesn't have a valid<br>
> > reading until the host is fully booted. Before it's becoming<br>
> alive<br>
> > and useful, it's getting disabled in code<br>
> > <br>
> (<a href="https://github.com/openbmc/dbus-sensors/blob/master/include/sensor.hpp#L266" rel="noreferrer" target="_blank">https://github.com/openbmc/dbus-sensors/blob/master/include/sensor.hpp#L266</a>)<br>
> > because of errors thrown up by the hwmon driver. Ideally, the<br>
> > thermal control loop should kick the fan to fail safe mode<br>
> until no<br>
> > more errors are observed.<br>
> ><br>
> > Any suggestions on how we should handle this kind of sensor<br>
> properly?<br>
> <br>
> For what its worth we use the PowerState property that has options of<br>
> power on or BiosPost to disable scanning when the state is invalid:<br>
> <a href="https://github.com/openbmc/dbus-sensors/blob/f27a55c775383a3fb1ac655f3eda785f6845f214/src/HwmonTempMain.cpp#L208" rel="noreferrer" target="_blank">https://github.com/openbmc/dbus-sensors/blob/f27a55c775383a3fb1ac655f3eda785f6845f214/src/HwmonTempMain.cpp#L208</a><br>
> <br>
> <br>
> ><br>
> > Thank you!<br>
> ><br>
> > - Alex Qiu<br>
> ><br>
> <br>
</blockquote></div>