[Skiboot] [PATCH 3/3] xscom: Do not wait indefinitely for xscom operation to complete

Nicholas Piggin npiggin at gmail.com
Thu May 21 09:14:37 AEST 2020


Excerpts from Vasant Hegde's message of May 21, 2020 3:27 am:
> On 5/20/20 8:54 PM, Vasant Hegde wrote:
>> On 5/20/20 7:43 PM, Nicholas Piggin wrote:
>>> Excerpts from Vasant Hegde's message of May 20, 2020 9:51 pm:
>>>> In some corner cases (specially on P8), xscom_wait_done() won't
>>>> complete.. resulting in waiting forever.
>>>>
>>>> This patch retries for predefined time and then exits even if xscom
>>>> operation did not complete.
>>>>
>>>> Signed-off-by: Vasant Hegde <hegdevasant at linux.vnet.ibm.com>
>>>> ---
>>>>   hw/xscom.c | 7 +++++--
>>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/hw/xscom.c b/hw/xscom.c
>>>> index 0eda567fc..500b6c0e9 100644
>>>> --- a/hw/xscom.c
>>>> +++ b/hw/xscom.c
>>>> @@ -72,10 +72,13 @@ static inline void *xscom_addr(uint32_t gcid, uint32_t 
>>>> pcb_addr)
>>>>   static uint64_t xscom_wait_done(void)
>>>>   {
>>>>       uint64_t hmer;
>>>> +    int retry_cnt = 0;
>>>> -    do
>>>> +    do {
>>>>           hmer = mfspr(SPR_HMER);
>>>> -    while(!(hmer & SPR_HMER_XSCOM_DONE));
>>>> +        if (retry_cnt++ > 5)
>>>> +            break;
>>>
>>> AFAIKS you'd better return SPR_HMER_XSCOM_FAIL here.
>> 
>> Makes sense. Will fix it in v2.
> 
> We pass return value from this function to handle_error() .. which uses this 
> value for isolating error.
> So I think it makes sense to just return `hmer` value.

Not if XSCOM_FAIL does not show up. And is the xscom status valid done 
does not get set? If it is, then you could or XSCOM_FAIL in to hmer.
Although I don't know how xscoms are supposed to be used to recover from 
xscom not completing. So what exactly are we supposed to do to handle 
this error, and have you tested it?

Thanks,
Nick


More information about the Skiboot mailing list