[PATCH v2] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails

Vipin K Parashar vipin at linux.vnet.ibm.com
Sun Mar 5 20:48:09 AEDT 2017



On Thursday 02 March 2017 06:00 PM, Vipin K Parashar wrote:
> Hi Stewart/Michael,
>
> Thanks!! for review.
>
> Responses as below:
>
>
> On Wednesday 01 March 2017 02:38 AM, Stewart Smith wrote:
>> Vipin K Parashar <vipin at linux.vnet.ibm.com> writes:
>>> Added check for OPAL_WRONG_STATE error code returned from OPAL.
>>> Currently Linux flashes "unexpected error" over console for this
>>> error. This will avoid throwing such message and return I/O error
>>> for such OPAL failures.
>>>
>>> Signed-off-by: Vipin K Parashar <vipin at linux.vnet.ibm.com>
>>> ---
>>> Changes in v2:
>>>   - Added log message indicating sleeping/offline core
>>>     for OPAL_WRONG_STATE
>>>
>>>   arch/powerpc/platforms/powernv/opal.c | 5 ++++-
>>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/opal.c 
>>> b/arch/powerpc/platforms/powernv/opal.c
>>> index 86d9fde..8af230e 100644
>>> --- a/arch/powerpc/platforms/powernv/opal.c
>>> +++ b/arch/powerpc/platforms/powernv/opal.c
>>> @@ -869,8 +869,11 @@ int opal_error_code(int rc)
>>>       case OPAL_UNSUPPORTED:        return -EIO;
>>>       case OPAL_HARDWARE:        return -EIO;
>>>       case OPAL_INTERNAL_ERROR:    return -EIO;
>>> +    case OPAL_WRONG_STATE:
>>> +        pr_notice("%s: Core sleeping/offline\n", __func__);
>>> +        return -EIO;
>> Since this is part of opal_error_code() though, this will be printed for
>> any OPAL call that returns that.
>
> opal_error_coder is used by functions to handle OPAL error codes
> and return Linux error codes. Apart from opal_get_sensor_data ()
> in opal-sensor.c, opal_error_code  is also getting invoked from
> opal_get_sys_param( ) in opal-sysparam.c.
>
> Handling OPAL_WRONG_STATE in opal_error_code itself, seems
> modular  and avoids extra checks for OPAL_WRONG_STATE after
> opal_error_code usage in multiple functions.
>
> opal_error_code is already adding a message upon OPAL_WRONG_STATE
> return, so its already leaving trace about Sleeping core causing XSCOM
> failure. By returning OPAL_WRONG_CODE from opal_error_code are we
> planning some action like on-lining back the sleeping or off-lined core ?
>
>>
>> Why not have the sensor code do this:
>>
>> rc = opal_sensor_read(foo)
>> if (rc == OPAL_WRONG_STATE)
>>     return -EIO;
>> else
>>     return oal_error_code(rc);
>>
>> ?
>>

Avoided adding OPAL_WRONG_STATE check in opal_error_code
and instead added a new case for OPAL_WRONG_STATE in
opal_get_sensor_data itself. Sending out v3 with the changes.

>>>       default:
>>> -        pr_err("%s: unexpected OPAL error %d\n", __func__, rc);
>>> +        pr_err("%s: Unexpected OPAL error %d\n", __func__, rc);
>> Do we need this?
>>
>
> This print helps in alerting about OPAL return codes that aren't 
> supported in
> running Linux version. Helpful in catching OPAL return code that missed
> out detection check in Linux.
> Shall we consider reducing message severity from pr_err to pr_warn ?
>



More information about the Linuxppc-dev mailing list