[PATCH] powernv/opal: Handle OPAL_WRONG_STATE error from OPAL fails
Stewart Smith
stewart at linux.vnet.ibm.com
Wed Feb 15 16:01:58 AEDT 2017
Michael Ellerman <mpe at ellerman.id.au> writes:
> Vipin K Parashar <vipin at linux.vnet.ibm.com> writes:
>
>> OPAL returns OPAL_WRONG_STATE for XSCOM operations
>> done to read any core FIR which is sleeping, offline.
>
> OK.
>
> Do we know why Linux is causing that to happen?
>
> It's also returned from many of the XIVE routines if we're in the wrong
> xive mode, all of which would indicate a fairly bad Linux bug.
>
> Also the skiboot patch which added WRONG_STATE for XSCOM ops did so
> explicitly so we could differentiate from other errors:
>
> commit 9c2d82394fd2303847cac4a665dee62556ca528a
> Author: Russell Currey <ruscur at russell.cc>
> AuthorDate: Mon Mar 21 12:00:00 2016 +1100
>
> xscom: Return OPAL_WRONG_STATE on XSCOM ops if CPU is asleep
>
> xscom_read and xscom_write return OPAL_SUCCESS if they worked, and
> OPAL_HARDWARE if they didn't. This doesn't provide information about why
> the operation failed, such as if the CPU happens to be asleep.
>
> This is specifically useful in error scanning, so if every CPU is being
> scanned for errors, sleeping CPUs likely aren't the cause of failures.
>
> So, return OPAL_WRONG_STATE in xscom_read and xscom_write if the CPU is
> sleeping.
>
> Signed-off-by: Russell Currey <ruscur at russell.cc>
> Reviewed-by: Alistair Popple <alistair at popple.id.au>
> Signed-off-by: Stewart Smith <stewart at linux.vnet.ibm.com>
>
> So I'm still not convinced that quietly swallowing this error and
> mapping it to -EIO along with several of the other error codes is the
> right thing to do.
FWIW I agree. There are pretty limited cases where it should just be
converted into -EIO and passed on, probably *just* the debugfs interface
to be honest.
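
To make the distinction concrete, here is a rough sketch of handling the return code per caller rather than in one shared mapping. It is not the kernel or skiboot code: the function names and the fir_scan enum are hypothetical, and the numeric OPAL values are assumed to match skiboot's opal-api.h, so check them against the header.

```c
#include <errno.h>

/* Assumed to mirror skiboot's opal-api.h; verify against the header. */
#define OPAL_SUCCESS        0
#define OPAL_HARDWARE      -6
#define OPAL_WRONG_STATE  -14

/* Hypothetical result of reading one core's FIR during an error scan. */
enum fir_scan {
	FIR_OK,       /* read worked, value is usable */
	FIR_SKIPPED,  /* core asleep/offline, not a suspect */
	FIR_FAILED    /* genuine failure, worth reporting */
};

/*
 * Error-scanning caller: keep OPAL_WRONG_STATE distinct so a sleeping
 * core is skipped instead of being reported as a hardware failure.
 */
static enum fir_scan classify_fir_read(long opal_rc)
{
	switch (opal_rc) {
	case OPAL_SUCCESS:
		return FIR_OK;
	case OPAL_WRONG_STATE:
		return FIR_SKIPPED;
	default:
		return FIR_FAILED;
	}
}

/*
 * debugfs-style passthrough caller: userspace only needs "it worked"
 * or "it did not", so collapsing every failure to -EIO is acceptable
 * here (and arguably only here).
 */
static int debugfs_xscom_errno(long opal_rc)
{
	return opal_rc == OPAL_SUCCESS ? 0 : -EIO;
}
```

The point is that the scanning path never sees -EIO for a sleeping core, while the debugfs path is free to flatten everything.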
--
Stewart Smith
OPAL Architect, IBM.