[PATCH linux dev-4.10 v3] drivers/hwmon/occ: Add sysfs_notify on error

Eddie James eajames at linux.vnet.ibm.com
Sat Jul 8 05:36:46 AEST 2017



On 07/07/2017 01:37 PM, Patrick Williams wrote:
> On Fri, Jul 07, 2017 at 04:06:30PM +0930, Joel Stanley wrote:
>> On Fri, Jul 7, 2017 at 12:58 AM, Eddie James <eajames at linux.vnet.ibm.com> wrote:
>>> From: "Edward A. James" <eajames at us.ibm.com>
>>> +void occ_set_error(struct occ *occ, int error)
>>> +{
>>> +       occ->error_count++;
>>> +       if (occ->error_count > OCC_ERROR_COUNT_THRESHOLD)
>> I think this policy can live in userspace. Instead of counting the
>> errors in the kernel and only reporting after some threshold, report
>> all of the errors.
>>
>> Userspace can then choose to do with this information what it wants,
>> according to the policy of the system.
>>
> This retry / threshold count is in the OCC specification itself and is
> not considered "policy of the system".  I believe the specification
> requires not considering errors under a threshold count as actual
> errors and if there is a successful communication the threshold count is
> supposed to be reset.  How do we coordinate that with userspace keeping a
> separate error policy?

In addition, there are a variety of different error types reported from 
the driver, such as "OCC in safe state for one minute" and "OCCs present 
mismatch". Those shouldn't be a part of the error counting, according to 
the OCC spec.
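
To make the distinction concrete, here is a minimal userspace sketch of the
behaviour discussed in this thread. It is not the driver code: every sketch_*
name, the error classes, and the threshold value of 2 are assumptions made for
illustration. It also assumes that the non-counted error types are reported
immediately, which the thread does not state explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value; the real OCC_ERROR_COUNT_THRESHOLD is not shown here. */
#define SKETCH_ERROR_COUNT_THRESHOLD 2

/* Hypothetical error classes, named after the examples in this thread. */
enum sketch_err_kind {
	SKETCH_ERR_COMM,       /* communication failure: counted */
	SKETCH_ERR_SAFE_STATE, /* "OCC in safe state for one minute": not counted */
	SKETCH_ERR_PRESENCE,   /* "OCCs present mismatch": not counted */
};

struct sketch_occ {
	int error_count; /* consecutive counted errors */
	int error;       /* last reported error, 0 if none */
};

/*
 * Record an error. Returns true when the error should be reported, which is
 * where a driver could call sysfs_notify().
 */
static bool sketch_set_error(struct sketch_occ *occ,
			     enum sketch_err_kind kind, int error)
{
	assert(occ != NULL);

	/* Assumption: non-communication errors bypass the counter entirely. */
	if (kind != SKETCH_ERR_COMM) {
		occ->error = error;
		return true;
	}

	/* Counted errors are only reported once the threshold is exceeded. */
	occ->error_count++;
	if (occ->error_count > SKETCH_ERROR_COUNT_THRESHOLD) {
		occ->error = error;
		return true;
	}

	return false;
}

/* A successful communication resets the counter. */
static void sketch_reset_error(struct sketch_occ *occ)
{
	assert(occ != NULL);
	occ->error_count = 0;
}
```

With a threshold of 2, the first two communication failures are swallowed and
the third is reported; a safe-state error is reported at once and leaves the
counter untouched.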

Thanks,
Eddie
