Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.

Thu Nguyen thu at amperemail.onmicrosoft.com
Wed Dec 16 18:33:04 AEDT 2020


Hi All,


I'm working with fan sensors on the Ampere MtJade platform.

On this platform I have multiple fans, named FAN3_1, FAN3_2, 
FAN4_1, FAN4_2, FAN5_1...

I added the configuration for those fans to phosphor-hwmon and also 
added the option "--enable-update-functional-on-fail" to the phosphor-hwmon 
build flags. My goal is to have a fan's Functional property set to false 
when the fan is unplugged.

After flashing the new image to the board, I read the Functional property 
of the fans. Reading the D-Bus property takes about 0.05 to 0.1 seconds:

root@mtjade:~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN4_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true

real    0m0.078s
user    0m0.002s
sys    0m0.032s
root@mtjade:~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN3_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true


real    0m0.044s
user    0m0.001s
sys    0m0.034s

After unplugging one fan (FAN4_2), I can see that the Functional property 
of FAN4_2 is set to false as expected, and the Functional property of the 
other fans stays true. But the time to get the D-Bus properties of all 
fans increases hugely, even for the working fans.

~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN4_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b false

real    0m1.189s
user    0m0.001s
sys    0m0.036s

~# time busctl get-property \
    xyz.openbmc_project.Hwmon-1644477290.Hwmon1 \
    /xyz/openbmc_project/sensors/fan_tach/FAN3_2 \
    xyz.openbmc_project.State.Decorator.OperationalStatus Functional
b true

real    0m3.285s
user    0m0.010s
sys    0m0.028s

The "ipmitool sdr type 0x4" command also fails because of this 
increase:

~$ time ipmitool -I lanplus -U root -P 0penBmc -C 17 -H <BMCIP> sdr type 0x4
FAN3_1           | 25h | ok  | 29.13 | 5100 RPM
FAN3_2           | 28h | ok  | 29.16 | 4700 RPM
FAN4_1           | 2Bh | ns  | 29.19 | No Reading
FAN4_2           | 2Eh | ns  | 29.22 | No Reading
FAN5_1           | 31h | ns  | 29.25 | No Reading
FAN5_2           | 34h | ns  | 29.28 | No Reading
FAN6_1           | 37h | ns  | 29.31 | No Reading
FAN6_2           | 3Ah | ns  | 29.34 | No Reading
FAN7_1           | 3Dh | ns  | 29.37 | No Reading
FAN7_2           | 40h | ns  | 29.40 | No Reading
FAN8_1           | 43h | ns  | 29.43 | No Reading
FAN8_2           | 46h | ns  | 29.46 | No Reading
PSU0_fan1        | F5h | ns  | 29.60 | No Reading
PSU1_fan1        | F6h | ns  | 29.61 | No Reading

real    2m43.704s
user    0m0.046s
sys    0m0.057s

The cause of this increase is that when phosphor-hwmon fails to read a 
sensor, it keeps retrying the read, up to 10 times with a 100 ms delay 
between attempts.
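
To make the cost concrete, here is a rough Python model of that retry 
behaviour. It is only a sketch, not the phosphor-hwmon source: the names 
read_with_retries, SensorReadError and unplugged_fan are made up for 
illustration, and only the 10 retries and the 100 ms delay come from the 
description above. The sleeps are recorded rather than performed, so the 
sketch runs instantly.

```python
import time

RETRIES = 10    # retry count quoted above
DELAY_S = 0.1   # 100 ms between attempts, as quoted above


class SensorReadError(Exception):
    """Hypothetical error raised once all retries are used up."""


def read_with_retries(read_once, retries=RETRIES, delay_s=DELAY_S,
                      sleep=time.sleep):
    """Call read_once(); on OSError, sleep and retry up to `retries` times."""
    attempts_left = retries
    while True:
        try:
            return read_once()
        except OSError as err:
            if attempts_left == 0:
                raise SensorReadError(str(err)) from err
            attempts_left -= 1
            sleep(delay_s)


def unplugged_fan():
    # Stand-in for a failed read of an unplugged fan's tach input.
    raise OSError("no tach signal")


slept = []  # record each delay instead of really sleeping
try:
    read_with_retries(unplugged_fan, sleep=slept.append)
except SensorReadError:
    pass

total_blocked = sum(slept)
print(f"retries: {len(slept)}, time spent waiting: {total_blocked:.1f} s")
```

Under these assumptions, every read of the unplugged fan spends about one 
second waiting before it gives up, which is in the same range as the 
delays measured above.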

Should we reduce the retry count for non-functional sensors?
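
As a sketch of what I have in mind (again a Python model, not a patch 
against phosphor-hwmon): a sensor keeps the full retry count while it is 
functional, and uses a smaller count once it has been marked 
non-functional. The Sensor class, the poll method and REDUCED_RETRIES = 1 
are assumptions for illustration only; the right reduced value is open for 
discussion.

```python
DEFAULT_RETRIES = 10   # retry count quoted above
REDUCED_RETRIES = 1    # hypothetical value for non-functional sensors
DELAY_S = 0.1          # 100 ms between attempts


class Sensor:
    """Hypothetical model of one hwmon sensor with a Functional flag."""

    def __init__(self, name, read_once):
        self.name = name
        self.read_once = read_once
        self.functional = True

    def poll(self, sleep):
        """Read the sensor once per cycle; return the value or None."""
        retries = DEFAULT_RETRIES if self.functional else REDUCED_RETRIES
        for attempt in range(retries + 1):
            try:
                value = self.read_once()
                self.functional = True   # a good read restores the flag
                return value
            except OSError:
                if attempt < retries:
                    sleep(DELAY_S)
        self.functional = False          # all attempts failed
        return None


def unplugged_fan():
    # Stand-in for a failed read of an unplugged fan's tach input.
    raise OSError("no tach signal")


delays = []  # record each delay instead of really sleeping
fan = Sensor("FAN4_2", unplugged_fan)

fan.poll(delays.append)
first_cycle = len(delays)                 # full retries on first failure
fan.poll(delays.append)
second_cycle = len(delays) - first_cycle  # reduced retries afterwards

print(f"first cycle: {first_cycle} retries, next cycle: {second_cycle}")
```

In this model the first failed cycle still pays for all 10 retries, but 
later cycles wait only 100 ms for the same fan, and a successful read sets 
the sensor back to functional with the full retry count.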


Regards.

Thu Nguyen.







