[PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails

Stewart Smith stewart at linux.vnet.ibm.com
Wed Feb 15 16:01:58 AEDT 2017


Michael Ellerman <mpe at ellerman.id.au> writes:
> Vipin K Parashar <vipin at linux.vnet.ibm.com> writes:
>
>> OPAL returns OPAL_WRONG_STATE for XSCOM operations
>> done to read any core FIR which is sleeping, offline.
>
> OK.
>
> Do we know why Linux is causing that to happen?
>
> It's also returned from many of the XIVE routines if we're in the wrong
> xive mode, all of which would indicate a fairly bad Linux bug.
>
> Also the skiboot patch which added WRONG_STATE for XSCOM ops did so
> explicitly so we could differentiate from other errors:
>
>     commit 9c2d82394fd2303847cac4a665dee62556ca528a
>     Author:     Russell Currey <ruscur at russell.cc>
>     AuthorDate: Mon Mar 21 12:00:00 2016 +1100
>
>     xscom: Return OPAL_WRONG_STATE on XSCOM ops if CPU is asleep
>     
>     xscom_read and xscom_write return OPAL_SUCCESS if they worked, and
>     OPAL_HARDWARE if they didn't.  This doesn't provide information about why
>     the operation failed, such as if the CPU happens to be asleep.
>     
>     This is specifically useful in error scanning, so if every CPU is being
>     scanned for errors, sleeping CPUs likely aren't the cause of failures.
>     
>     So, return OPAL_WRONG_STATE in xscom_read and xscom_write if the CPU is
>     sleeping.
>     
>     Signed-off-by: Russell Currey <ruscur at russell.cc>
>     Reviewed-by: Alistair Popple <alistair at popple.id.au>
>     Signed-off-by: Stewart Smith <stewart at linux.vnet.ibm.com>
>
> So I'm still not convinced that quietly swallowing this error and
> mapping it to -EIO along with several of the other error codes is the
> right thing to do.

FWIW I agree. There are pretty limited cases where it should just be
converted into -EIO and passed on, probably *just* the debugfs interface
to be honest.
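
To make the distinction concrete, here is a rough sketch, not code from the
patch or from the kernel. The function and enum names are hypothetical, and
the numeric return codes are assumed to mirror the OPAL API values. It
contrasts an error-scanning caller, which treats OPAL_WRONG_STATE as "core
asleep, skip it", with a debugfs-style caller, where collapsing every
failure to -EIO is arguably acceptable.

```c
#include <errno.h>

/* Assumed values for illustration, intended to mirror the OPAL return codes. */
enum {
	OPAL_SUCCESS     = 0,
	OPAL_HARDWARE    = -6,
	OPAL_WRONG_STATE = -14,
};

/* Hypothetical result type for an error-scanning caller. */
enum scan_result { SCAN_OK, SCAN_SKIPPED, SCAN_FAILED };

/*
 * Hypothetical error-scan helper: a sleeping/offline core is skipped
 * rather than reported as a failure, which is the case the skiboot
 * commit quoted above wanted callers to be able to tell apart.
 */
static enum scan_result classify_fir_read(long opal_rc)
{
	switch (opal_rc) {
	case OPAL_SUCCESS:
		return SCAN_OK;
	case OPAL_WRONG_STATE:
		return SCAN_SKIPPED;
	default:
		return SCAN_FAILED;
	}
}

/*
 * Hypothetical debugfs-style mapping: userspace only needs to know the
 * access failed, so every non-success code becomes -EIO here.
 */
static int debugfs_scom_errno(long opal_rc)
{
	return opal_rc == OPAL_SUCCESS ? 0 : -EIO;
}
```

With something along these lines, only the debugfs path loses the reason for
the failure; any caller that scans cores for errors can still see that the
core was simply asleep.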

-- 
Stewart Smith
OPAL Architect, IBM.


