Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
Lei Yu
yulei.sh at bytedance.com
Thu Dec 24 12:52:29 AEDT 2020
On Wed, Dec 23, 2020 at 11:33 PM Thu Nguyen
<thu at amperemail.onmicrosoft.com> wrote:
>
> On 12/16/20 14:33, Thu Nguyen wrote:
> > Hi All,
> >
> >
> > I'm working with fan sensors on the Ampere MtJade platform.
> >
> > On this platform, I have multiple fans, named FAN3_1, FAN3_2,
> > FAN4_1, FAN4_2, FAN5_1...
> >
> > I added the configuration for those fans in phosphor-hwmon and I also
> > added the option "--enable-update-functional-on-fail" to the
> > phosphor-hwmon build flags. I'm trying to set a fan's Functional
> > property to false when the fan is unplugged.
> >
> > After flashing the new image to the board, I read the Functional
> > property of the fans. Reading the D-Bus property takes about 0.05 to
> > 0.1 seconds:
> >
> > root@mtjade:~# time busctl get-property
> > xyz.openbmc_project.Hwmon-1644477290.Hwmon1
> > /xyz/openbmc_project/sensors/fan_tach/FAN4_2
> > xyz.openbmc_project.State.Decorator.OperationalStatus Functional
> > b true
> >
> > real 0m0.078s
> > user 0m0.002s
> > sys 0m0.032s
> > root@mtjade:~# time busctl get-property
> > xyz.openbmc_project.Hwmon-1644477290.Hwmon1
> > /xyz/openbmc_project/sensors/fan_tach/FAN3_2
> > xyz.openbmc_project.State.Decorator.OperationalStatus Functional
> > b true
> >
> >
> > real 0m0.044s
> > user 0m0.001s
> > sys 0m0.034s
> >
> > After unplugging one fan (FAN4_2), I can see that the Functional
> > property of FAN4_2 is set to false as expected, and the Functional
> > property of the other fans stays true. But the time to get the D-Bus
> > properties of all fans increases hugely, even for the working fans.
> >
> > ~# time busctl get-property
> > xyz.openbmc_project.Hwmon-1644477290.Hwmon1
> > /xyz/openbmc_project/sensors/fan_tach/FAN4_2
> > xyz.openbmc_project.State.Decorator.OperationalStatus Functional
> > b false
> >
> > real 0m1.189s
> > user 0m0.001s
> > sys 0m0.036s
> >
> > ~# time busctl get-property
> > xyz.openbmc_project.Hwmon-1644477290.Hwmon1
> > /xyz/openbmc_project/sensors/fan_tach/FAN3_2
> > xyz.openbmc_project.State.Decorator.OperationalStatus Functional
> > b true
> >
> > real 0m3.285s
> > user 0m0.010s
> > sys 0m0.028s
> >
> > The "ipmitool sdr type 0x4" command also fails because of this
> > increase.
> >
> > ~$ time ipmitool -I lanplus -U root -P 0penBmc -C 17 -H <BMCIP> sdr
> > type 0x4
> > FAN3_1 | 25h | ok | 29.13 | 5100 RPM
> > FAN3_2 | 28h | ok | 29.16 | 4700 RPM
> > FAN4_1 | 2Bh | ns | 29.19 | No Reading
> > FAN4_2 | 2Eh | ns | 29.22 | No Reading
> > FAN5_1 | 31h | ns | 29.25 | No Reading
> > FAN5_2 | 34h | ns | 29.28 | No Reading
> > FAN6_1 | 37h | ns | 29.31 | No Reading
> > FAN6_2 | 3Ah | ns | 29.34 | No Reading
> > FAN7_1 | 3Dh | ns | 29.37 | No Reading
> > FAN7_2 | 40h | ns | 29.40 | No Reading
> > FAN8_1 | 43h | ns | 29.43 | No Reading
> > FAN8_2 | 46h | ns | 29.46 | No Reading
> > PSU0_fan1 | F5h | ns | 29.60 | No Reading
> > PSU1_fan1 | F6h | ns | 29.61 | No Reading
> >
> > real 2m43.704s
> > user 0m0.046s
> > sys 0m0.057s
> >
> > The cause of this increase is that when phosphor-hwmon fails to read
> > a sensor, it keeps retrying the read, with 10 retries and a 100 ms
> > delay between retries.
> >
> > Should we reduce the retry count for non-functional sensors?
When a fan is unplugged, its "Present" property should be false as well.
Maybe you could check that property and skip such fans?
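
As a rough illustration of both ideas (skipping absent fans, and not
retrying fans already marked non-functional), here is a minimal,
self-contained C++ sketch. It is not phosphor-hwmon code: FanState,
readWithRetries and the readOnce callback are hypothetical names, and
only the 10-retry / 100 ms defaults come from the description above.
With those defaults a sensor that always fails costs about
10 x 100 ms = 1 s per read, which looks consistent with the timings
reported for the unplugged fan.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <optional>
#include <thread>

// Hypothetical per-fan state mirroring the "Present" and "Functional"
// D-Bus properties discussed in this thread (not a phosphor-hwmon type).
struct FanState
{
    bool present = true;
    bool functional = true;
};

// Defaults as described in the thread: 10 retries, 100 ms apart.
constexpr std::size_t defaultRetries = 10;
constexpr std::chrono::milliseconds defaultDelay{100};

// Read a sensor value, spending the full retry budget only on fans that
// are present and still considered functional.
//  - absent fan:          skipped, readOnce is never called
//  - non-functional fan:  one attempt, no retries
//  - healthy fan:         one attempt plus up to defaultRetries retries
std::optional<long> readWithRetries(
    const std::function<std::optional<long>()>& readOnce,
    const FanState& state,
    std::chrono::milliseconds delay = defaultDelay)
{
    if (!state.present)
    {
        return std::nullopt;
    }

    const std::size_t retries = state.functional ? defaultRetries : 0;

    for (std::size_t attempt = 0;; ++attempt)
    {
        if (auto value = readOnce())
        {
            return value;
        }
        if (attempt >= retries)
        {
            return std::nullopt; // retry budget exhausted
        }
        std::this_thread::sleep_for(delay);
    }
}
```

With this policy an unplugged fan costs at most one failed read per
polling cycle instead of roughly a second, so it should no longer delay
the reads of the working fans. Whether a non-functional fan should get
zero retries or a small number is a design choice left open here.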
> >
> >
> > Regards.
> >
> > Thu Nguyen
> Hi All,
>
> Any feedback on this?
>
> Thu Nguyen,
>
--
BRs,
Lei YU