Phosphor-hwmon: reduce hwmonio::retries when sensor is Nonfunctional.

Thu Nguyen thu at amperemail.onmicrosoft.com
Thu Dec 24 02:32:29 AEDT 2020


On 12/16/20 14:33, Thu Nguyen wrote:
> Hi All,
>
>
> I'm working with fan sensors on the Ampere MtJade platform.
>
> On this platform, I have multiple fans named FAN3_1, FAN3_2, FAN4_1,
> FAN4_2, FAN5_1...
>
> I added the configuration for those fans to phosphor-hwmon and also
> added the "--enable-update-functional-on-fail" option to the
> phosphor-hwmon build flags. I'm trying to set a fan's Functional
> property to false when the fan is unplugged.
>
> After flashing the new image to the board, I read the Functional
> property of the fans. Reading the D-Bus property takes about 0.05 to
> 0.1 seconds:
>
> root@mtjade:~# time busctl get-property
> xyz.openbmc_project.Hwmon-1644477290.Hwmon1 
> /xyz/openbmc_project/sensors/fan_tach/FAN4_2 
> xyz.openbmc_project.State.Decorator.OperationalStatus Functional
> b true
>
> real    0m0.078s
> user    0m0.002s
> sys    0m0.032s
> root@mtjade:~# time busctl get-property
> xyz.openbmc_project.Hwmon-1644477290.Hwmon1 
> /xyz/openbmc_project/sensors/fan_tach/FAN3_2 
> xyz.openbmc_project.State.Decorator.OperationalStatus Functional
> b true
>
>
> real    0m0.044s
> user    0m0.001s
> sys    0m0.034s
>
> After unplugging one fan (FAN4_2), I can see that the Functional
> property of FAN4_2 is set to false as expected, and the Functional
> property of the other fans stays true. But the time to get the D-Bus
> properties of all fans increases hugely, even for the working fans.
>
> ~# time busctl get-property 
> xyz.openbmc_project.Hwmon-1644477290.Hwmon1 
> /xyz/openbmc_project/sensors/fan_tach/FAN4_2 
> xyz.openbmc_project.State.Decorator.OperationalStatus Functional
> b false
>
> real    0m1.189s
> user    0m0.001s
> sys    0m0.036s
>
> ~# time busctl get-property 
> xyz.openbmc_project.Hwmon-1644477290.Hwmon1 
> /xyz/openbmc_project/sensors/fan_tach/FAN3_2 
> xyz.openbmc_project.State.Decorator.OperationalStatus Functional
> b true
>
> real    0m3.285s
> user    0m0.010s
> sys    0m0.028s
>
> The "ipmitool sdr type 0x4" command also fails because of this
> increase.
>
> ~$ time ipmitool -I lanplus -U root -P 0penBmc -C 17 -H <BMCIP> sdr 
> type 0x4
> FAN3_1           | 25h | ok  | 29.13 | 5100 RPM
> FAN3_2           | 28h | ok  | 29.16 | 4700 RPM
> FAN4_1           | 2Bh | ns  | 29.19 | No Reading
> FAN4_2           | 2Eh | ns  | 29.22 | No Reading
> FAN5_1           | 31h | ns  | 29.25 | No Reading
> FAN5_2           | 34h | ns  | 29.28 | No Reading
> FAN6_1           | 37h | ns  | 29.31 | No Reading
> FAN6_2           | 3Ah | ns  | 29.34 | No Reading
> FAN7_1           | 3Dh | ns  | 29.37 | No Reading
> FAN7_2           | 40h | ns  | 29.40 | No Reading
> FAN8_1           | 43h | ns  | 29.43 | No Reading
> FAN8_2           | 46h | ns  | 29.46 | No Reading
> PSU0_fan1        | F5h | ns  | 29.60 | No Reading
> PSU1_fan1        | F6h | ns  | 29.61 | No Reading
>
> real    2m43.704s
> user    0m0.046s
> sys    0m0.057s
>
> The cause of this increase is that when phosphor-hwmon fails to read
> one sensor, it keeps retrying the read, with a retry count of 10 and
> a 100 ms delay between retries.
>
> Should we reduce the retry count for non-functional sensors?
>
>
> Regards.
>
> Thu Nguyen
Hi All,

Any feedback on this?
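
To make the question concrete, here is a rough Python model of the retry
policy described above. It is only a sketch, not phosphor-hwmon code: the
names read_with_retries, unplugged_fan and REDUCED_RETRIES are hypothetical,
and the reduced count of 1 is an assumed value for illustration. Only the 10
retries and the 100 ms delay come from the description above.

```python
import time

DEFAULT_RETRIES = 10   # retry count quoted in this thread
DEFAULT_DELAY_S = 0.1  # 100 ms between retries, as quoted in this thread
REDUCED_RETRIES = 1    # hypothetical value for non-functional sensors


def read_with_retries(read_fn, functional, sleep=time.sleep):
    """Call read_fn until it succeeds or the retry budget is used up.

    Returns (value, seconds_slept); value is None if every attempt failed.
    """
    retries = DEFAULT_RETRIES if functional else REDUCED_RETRIES
    slept = 0.0
    while True:
        try:
            return read_fn(), slept
        except OSError:
            if retries == 0:
                return None, slept
            retries -= 1
            sleep(DEFAULT_DELAY_S)
            slept += DEFAULT_DELAY_S


def unplugged_fan():
    # Stand-in for a sysfs read of a fan that has been removed.
    raise OSError("no such device")


no_sleep = lambda _seconds: None  # keep the demo fast; only count the delay
_, full_budget = read_with_retries(unplugged_fan, functional=True, sleep=no_sleep)
_, reduced_budget = read_with_retries(unplugged_fan, functional=False, sleep=no_sleep)
print(f"full budget: {full_budget:.1f}s, reduced budget: {reduced_budget:.1f}s")
```

In this model, each failed read with the full budget blocks for about
10 x 100 ms = 1 second, which is roughly in line with the 1.189 s measured
for FAN4_2 above. With the assumed reduced budget, the same failed read
would block for about 0.1 second.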

Thu Nguyen,


