Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.
Thu Nguyen
thu at amperemail.onmicrosoft.com
Wed Dec 16 18:33:04 AEDT 2020
Hi All,
I'm working with fan sensors on the Ampere MtJade platform.
This platform has multiple fans, named FAN3_1, FAN3_2,
FAN4_1, FAN4_2, FAN5_1...
I added the configuration for those fans to phosphor-hwmon and also
added the option "--enable-update-functional-on-fail" to the
phosphor-hwmon build flags. The goal is to have a fan's Functional
property set to false when the fan is unplugged.
After flashing the new image to the board, I read the Functional
property of the fans. Reading the D-Bus property takes about 0.05 to
0.1 seconds:
root@mtjade:~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN4_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true
real 0m0.078s
user 0m0.002s
sys 0m0.032s
root@mtjade:~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN3_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true
real 0m0.044s
user 0m0.001s
sys 0m0.034s
After unplugging one fan (FAN4_2), I can see that the Functional
property of FAN4_2 is set to false as expected, and the Functional
property of the other fans stays true.
But the time to get the D-Bus properties of all fans increases hugely,
even for the working fans.
~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN4_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b false
real 0m1.189s
user 0m0.001s
sys 0m0.036s
~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN3_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true
real 0m3.285s
user 0m0.010s
sys 0m0.028s
The "ipmitool sdr type 0x4" command also fails because of this
increase:
~$ time ipmitool -I lanplus -U root -P 0penBmc -C 17 -H <BMCIP> sdr type 0x4
FAN3_1    | 25h | ok | 29.13 | 5100 RPM
FAN3_2    | 28h | ok | 29.16 | 4700 RPM
FAN4_1    | 2Bh | ns | 29.19 | No Reading
FAN4_2    | 2Eh | ns | 29.22 | No Reading
FAN5_1    | 31h | ns | 29.25 | No Reading
FAN5_2    | 34h | ns | 29.28 | No Reading
FAN6_1    | 37h | ns | 29.31 | No Reading
FAN6_2    | 3Ah | ns | 29.34 | No Reading
FAN7_1    | 3Dh | ns | 29.37 | No Reading
FAN7_2    | 40h | ns | 29.40 | No Reading
FAN8_1    | 43h | ns | 29.43 | No Reading
FAN8_2    | 46h | ns | 29.46 | No Reading
PSU0_fan1 | F5h | ns | 29.60 | No Reading
PSU1_fan1 | F6h | ns | 29.61 | No Reading
real 2m43.704s
user 0m0.046s
sys 0m0.057s
The cause of this increase is that when phosphor-hwmon fails to read a
sensor, it keeps retrying the read, with 10 retries and a 100 ms delay
between retries.
Should we reduce the retry count for non-functional sensors?
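To make the question concrete, here is a minimal Python sketch of the
retry behaviour described above. It is a simplified model and not the
phosphor-hwmon code itself: the names read_with_retries, retries_for and
NONFUNCTIONAL_RETRIES are hypothetical, and the reduced value of 1 is
only an example. With 10 retries and a 100 ms delay, a read that keeps
failing blocks for roughly 1 second, which is in line with the 1.189 s
measured for FAN4_2 above.

```python
import time

DEFAULT_RETRIES = 10        # retry count described above
RETRY_DELAY_S = 0.1         # 100 ms delay between retries
NONFUNCTIONAL_RETRIES = 1   # hypothetical reduced value, for illustration


def worst_case_block_s(retries, delay_s=RETRY_DELAY_S):
    """Total time spent sleeping when every attempt fails."""
    return retries * delay_s


def retries_for(functional):
    """Pick the retry budget from the sensor's last known Functional state."""
    return DEFAULT_RETRIES if functional else NONFUNCTIONAL_RETRIES


def read_with_retries(read_fn, retries, delay_s, sleep=time.sleep):
    """Call read_fn until it succeeds or the retry budget is spent."""
    while True:
        try:
            return read_fn()
        except OSError:
            if retries == 0:
                raise          # budget exhausted, report the failure
            retries -= 1
            sleep(delay_s)     # wait before the next attempt
```

In this model an unplugged fan costs worst_case_block_s(10), about 1
second per read, with the current settings, and about 0.1 second once
the sensor has been marked non-functional. A sensor that recovers would
still be picked up on a later read, just with fewer attempts per read.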
Regards.
Thu Nguyen.