[Skiboot] [PATCH v3] hw/xscom: Reset XSCOM engine after finite number of retries when busy

Vipin K Parashar vipin at linux.vnet.ibm.com
Mon Jun 6 16:21:42 AEST 2016


Thanks!! Mahesh and Vaidy for review.
I will sent out new revision addressing review comments.

Regards,
Vipin

On Friday 03 June 2016 11:29 AM, Vaidyanathan Srinivasan wrote:
> * Mahesh J Salgaonkar <mahesh at linux.vnet.ibm.com> [2016-06-03 11:16:30]:
>
>> On 2016-05-31 17:09:01 Tue, Vipin K Parashar wrote:
> [snip]
>
>> I understand that today we are not sure about how to verify whether xscom reset
>> has succeeded or not and hence this patch adds the delay. But I think we
>> should add FIXME comment here so that in future we can fix this
>> properly by talking to hardware folks to find out alternate way to
>> verify xscom reset completion status.
> Yes, we could add FIXME in comment until we can figure out the exact
> cycle checks from hardware team. But this is a very very rare recovery
> case/scenario and should not hold-up OPAL under normal operation.
>
>>> +			ts.tv_sec = 0;
>>> +			ts.tv_nsec = 10 * 1000;
>>> +			nanosleep_nopoll(&ts, NULL);
>>> +		}
>>> +
>>> +		/* Log error if we have retried enough and its still busy */
>>> +		if (retries == XSCOM_BUSY_MAX_RETRIES)
>>> +			log_simple_error(&e_info(OPAL_RC_XSCOM_BUSY),
>>> +				"XSCOM: %s-busy error gcid=0x%x pcb_addr=0x%x "
>>> +				"stat=0x%x\n", is_write ? "write" : "read",
>>> +				gcid, pcb_addr, stat);
>>>   		return OPAL_BUSY;
>>> +
>>>   	/* CPU is asleep, don't retry */
>>>   	case 2:
>>>   		return OPAL_WRONG_STATE;
>>> @@ -177,15 +210,16 @@ static bool xscom_gcid_ok(uint32_t gcid)
>>>    */
>>>   static int __xscom_read(uint32_t gcid, uint32_t pcb_addr, uint64_t *val)
>>>   {
>>> +	int i;
>>>   	uint64_t hmer;
>>> -	int64_t ret;
>>> +	int64_t ret, retries = 0;
>>>   
>>>   	if (!xscom_gcid_ok(gcid)) {
>>>   		prerror("%s: invalid XSCOM gcid 0x%x\n", __func__, gcid);
>>>   		return OPAL_PARAMETER;
>>>   	}
>>>   
>>> -	for (;;) {
>>> +	for (i = 0; i <= XSCOM_BUSY_MAX_RETRIES; i++) {
>          for (retries = 0; retries <= XSCOM_BUSY_MAX_RETRIES; retries++)
>
>>>   		/* Clear status bits in HMER (HMER is special
>>>   		 * writing to it *ands* bits
>>>   		 */
>>> @@ -199,27 +233,31 @@ static int __xscom_read(uint32_t gcid, uint32_t pcb_addr, uint64_t *val)
>>>   
>>>   		/* Check for error */
>>>   		if (!(hmer & SPR_HMER_XSCOM_FAIL))
>>> -			break;
>>> +			return OPAL_SUCCESS;
>>>   
>>>   		/* Handle error and possibly eventually retry */
>>> -		ret = xscom_handle_error(hmer, gcid, pcb_addr, false);
>>> -		if (ret == OPAL_HARDWARE || ret == OPAL_WRONG_STATE)
>>> -			return ret;
>>> +		ret = xscom_handle_error(hmer, gcid, pcb_addr, false, retries);
>>> +		if (ret != OPAL_BUSY)
>>> +			break;
>>> +		retries++;
>> I think we can get rid off extra variable here 'retries'. Just pass
>> variable 'i' to xscom_handle_error() which anyway gets incremented as
>> part of for loop.
> Good idea. How about keep the name retries, and remove i :)
>
> --Vaidy



More information about the Skiboot mailing list