[PATCH linux dev-4.10 v3] drivers/hwmon/occ: Add sysfs_notify on error

Lei YU mine260309 at gmail.com
Thu Jul 20 00:40:13 AEST 2017


Tested-by: Lei YU <mine260309 at gmail.com>

Without this patch, the occ hwmon sensor readings are all empty on P8.
With this patch the issue is fixed and all the readings are OK.


On Sat, Jul 8, 2017 at 3:36 AM, Eddie James <eajames at linux.vnet.ibm.com> wrote:
>
>
> On 07/07/2017 01:37 PM, Patrick Williams wrote:
>>
>> On Fri, Jul 07, 2017 at 04:06:30PM +0930, Joel Stanley wrote:
>>>
>>> On Fri, Jul 7, 2017 at 12:58 AM, Eddie James <eajames at linux.vnet.ibm.com>
>>> wrote:
>>>>
>>>> From: "Edward A. James" <eajames at us.ibm.com>
>>>> +void occ_set_error(struct occ *occ, int error)
>>>> +{
>>>> +       occ->error_count++;
>>>> +       if (occ->error_count > OCC_ERROR_COUNT_THRESHOLD)
>>>
>>> I think this policy can live in userspace. Instead of counting the
>>> errors in the kernel and only reporting after some threshold, report
>>> all of the errors.
>>>
>>> Userspace can then choose what to do with this information,
>>> according to the policy of the system.
>>>
>> This retry / threshold count is in the OCC specification itself and is
>> not considered "policy of the system".  I believe the specification
>> requires not considering errors under a threshold count as actual
>> errors and if there is a successful communication the threshold count is
>> supposed to be reset.  How do we coordinate that with userspace keeping a
>> separate error policy?
>
>
> In addition, there are a variety of different error types reported from the
> driver, such as "OCC in safe state for one minute" and "OCCs present
> mismatch". Those shouldn't be a part of the error counting, according to the
> OCC spec.
>
> Thanks,
> Eddie
>
>>
>>
>
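
For readers following the thread, here is a minimal user-space sketch of the
counting behaviour described above: errors are only exposed once a threshold
of consecutive failures is exceeded, a successful transfer resets the count,
and some error types bypass the counting entirely. This is not the driver
code from the patch. The struct, the function names with the occ_sketch_
prefix, and the threshold value of 2 are all assumptions made for
illustration; the real values and rules come from the OCC specification.

```c
#include <stdbool.h>

/* Hypothetical value; the real threshold is defined by the OCC spec. */
#define OCC_ERROR_COUNT_THRESHOLD 2

/* Hypothetical stand-in for the driver's state, not the real struct occ. */
struct occ_sketch {
	int error_count;	/* consecutive failed transfers */
	int error;		/* error currently exposed, 0 if none */
};

/*
 * Record a failed transfer. The error is exposed only after the count
 * exceeds the threshold. Returns true when the exposed value changed,
 * which is where a driver could call sysfs_notify().
 */
static bool occ_sketch_set_error(struct occ_sketch *occ, int error)
{
	occ->error_count++;
	if (occ->error_count > OCC_ERROR_COUNT_THRESHOLD &&
	    occ->error != error) {
		occ->error = error;
		return true;
	}
	return false;
}

/* Record a successful transfer: clear the count and any exposed error. */
static bool occ_sketch_reset_error(struct occ_sketch *occ)
{
	bool changed = occ->error != 0;

	occ->error_count = 0;
	occ->error = 0;
	return changed;
}

/*
 * Errors that are not subject to counting (for example a safe-state
 * condition) are exposed immediately and leave error_count untouched.
 */
static bool occ_sketch_set_uncounted_error(struct occ_sketch *occ, int error)
{
	bool changed = occ->error != error;

	occ->error = error;
	return changed;
}
```

With the assumed threshold of 2, the first two consecutive failures are
absorbed and the third one is reported; a single success in between starts
the count over.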

