[Skiboot] [PATCH stable 5.1] hw/phb3: set PHB retry state correctly when fresetting during a creset

Andrew Donnellan andrew.donnellan at au1.ibm.com
Wed Nov 16 19:39:03 AEDT 2016


On 16/09/16 16:51, Stewart Smith wrote:
> Andrew Donnellan <andrew.donnellan at au1.ibm.com> writes:
>> When we're doing a complete reset, after we complete the ETU reset and wait
>> for the PHB to return, we need to do a fundamental reset.
>>
>> When we do the fundamental reset, we poll for a link up. This isn't always
>> successful on the first attempt. In phb3_sm_link_poll(), if we time out
>> while waiting for the link to come up, we call phb3_retry_state() to reset
>> p->state back to p->retry_state and poll again. On the second poll, we
>> clear the retry state so we don't retry again.
>>
>> However, when we do the fundamental reset as part of a complete reset, we
>> don't explicitly set the retry state. This means that we only retry if
>> there wasn't an earlier fundamental reset that had to retry and thus
>> cleared the retry state. This reduces the reliability of the complete reset
>> process.
>>
>> In phb3_sm_complete_reset(), when in state PHB3_STATE_CRESET_FRESET,
>> set the retry state to PHB3_STATE_FRESET_START, as is done in
>> phb3_fundamental_reset().
>>
>> Reported-by: Pradipta Ghosh <pradghos at in.ibm.com>
>> Suggested-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>> Cc: Uma Krishnan <ukrishn at linux.vnet.ibm.com>
>> Cc: Matthew Ochs <mrochs at linux.vnet.ibm.com>
>> Signed-off-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>
>>
>> ---
>>
>> Very lightly tested - I've asked Pradipta and Uma to test this
>> properly and ensure this gets rid of their problem.
>
> Okay - I'll hold off merging until I see a Tested-by.
>
>> This is based off 5.1.18. The code in question has been
>> rewritten in 5.3.
>
> Okay, when merge time comes, I'll pull it into 5.1.x only.

Pradipta/Uma - we should get this merged. I've added further comments to
the IBM internal bug (SW359608/BZ144323) about the remaining reset
failures that we haven't explained yet, but I don't see any reason not
to merge this patch.
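
For anyone following along, the retry behaviour described in the commit
message quoted above can be sketched roughly as below. This is a
standalone toy model, not the skiboot code: every name in it
(model_phb, MODEL_*, model_complete_reset() and so on) is made up for
illustration, and the link timeout is reduced to a counter. It only
shows why a stale, already-cleared retry state means the fundamental
reset inside a complete reset gets a single attempt, and how setting
the retry state first restores the one retry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the PHB3 state names. */
enum model_state {
	MODEL_NO_RETRY,		/* sentinel: no retry is armed */
	MODEL_FRESET_START,
	MODEL_CRESET_FRESET,
	MODEL_FUNCTIONAL,
	MODEL_BROKEN,
};

struct model_phb {
	enum model_state state;
	enum model_state retry_state;
	int link_up_on_attempt;	/* attempt on which the link trains; 0 = never */
	int attempts;		/* fundamental reset attempts made so far */
};

/* One fundamental reset attempt; returns true if the link came up. */
static bool model_freset_once(struct model_phb *p)
{
	assert(p->link_up_on_attempt >= 0);
	p->attempts++;
	return p->link_up_on_attempt != 0 &&
	       p->attempts >= p->link_up_on_attempt;
}

/* Poll for link up; on timeout, fall back to the retry state once. */
static bool model_link_poll(struct model_phb *p)
{
	for (;;) {
		if (model_freset_once(p)) {
			p->state = MODEL_FUNCTIONAL;
			return true;
		}
		/* Timed out: give up if no retry is armed. */
		if (p->retry_state == MODEL_NO_RETRY) {
			p->state = MODEL_BROKEN;
			return false;
		}
		/* Go back to the retry state and clear it, so that we
		 * retry at most once. */
		p->state = p->retry_state;
		p->retry_state = MODEL_NO_RETRY;
	}
}

/* Complete reset: the ETU reset is elided, only the freset step is
 * modelled. arm_retry selects the behaviour with or without the fix. */
static bool model_complete_reset(struct model_phb *p, bool arm_retry)
{
	p->attempts = 0;
	p->state = MODEL_CRESET_FRESET;
	if (arm_retry)
		p->retry_state = MODEL_FRESET_START;
	p->state = MODEL_FRESET_START;
	return model_link_poll(p);
}
```

In this model, a PHB whose retry state was cleared by an earlier reset
and whose link only trains on the second attempt fails
model_complete_reset(p, false) after one attempt, but succeeds with
model_complete_reset(p, true) after two.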

-- 
Andrew Donnellan              OzLabs, ADL Canberra
andrew.donnellan at au1.ibm.com  IBM Australia Limited


