[Skiboot] [PATCH 1/2] opal/xscom: Move the delay inside xscom_reset() function.

Mahesh Jagannath Salgaonkar mahesh at linux.vnet.ibm.com
Fri Dec 8 16:43:52 AEDT 2017


On 12/08/2017 09:12 AM, Nicholas Piggin wrote:
> On Thu, 07 Dec 2017 21:43:00 +0530
> Mahesh J Salgaonkar <mahesh at linux.vnet.ibm.com> wrote:
> 
>> From: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
>>
>> So caller of xscom_reset() does not have to bother about adding a delay
>> separately.
> 
> Hmm, this does change behaviour a bit. I suppose it is harmless because
> it's just adding small delays in error paths. But it seems that the delay
> was only needed for busy xscom, and only to delay the retry of the same
> xscom. Whereas xscom_reset gets called in more situations.
> 
> Does the xscom_clear_error in your next patch also require some delays?

Yes, it does. I struggled for couple of days to get invalid address for
clearing write to 20010800 immediately after xscom_reset(). Without this
delay the write was responding with scom done with no error, which was
completely confusing as there is no register at 0x20010800. With the
delay things started working as expected. It looks like delay is very
much required if we are doing next scom operation immediately after the
reset.

> I wouldn't mind having those delays explicitly commented in there.

I see your point. If there is no scom operation immediately after the
reset we can do away with the delay. In the normal scom failure other
than busy scom and clear scom we don;t need this delay. How about adding
'need_delay' as a second argument to the xscom_reset(gcid,
need_delay=<true|false>) function to control the delay instead of adding
delays multiple places explicitly ? Do you think that should should work
? I will respin v2 with the change if it's ok.

Thanks for your review.

-Mahesh.

> 
> Thanks,
> Nick
> 
>>
>> Signed-off-by: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
>> ---
>>  hw/xscom.c |   27 ++++++++++++++-------------
>>  1 file changed, 14 insertions(+), 13 deletions(-)
>>
>> diff --git a/hw/xscom.c b/hw/xscom.c
>> index 5b3bd88..2621465 100644
>> --- a/hw/xscom.c
>> +++ b/hw/xscom.c
>> @@ -96,6 +96,7 @@ static void xscom_reset(uint32_t gcid)
>>  {
>>  	u64 hmer;
>>  	uint32_t recv_status_reg, log_reg, err_reg;
>> +	struct timespec ts;
>>  
>>  	/* Clear errors in HMER */
>>  	mtspr(SPR_HMER, HMER_CLR_MASK);
>> @@ -126,6 +127,19 @@ static void xscom_reset(uint32_t gcid)
>>  	hmer = xscom_wait_done();
>>  	if (hmer & SPR_HMER_XSCOM_FAIL)
>>  		goto fail;
>> +
>> +	/*
>> +	 * Its observed that sometimes immediate retry of
>> +	 * XSCOM operation returns wrong data. Adding a
>> +	 * delay for XSCOM reset to be effective. Delay of
>> +	 * 10 ms is found to be working fine experimentally.
>> +	 * FIXME: Replace 10ms delay by exact delay needed
>> +	 * or other alternate method to confirm XSCOM reset
>> +	 * completion, after checking from HW folks.
>> +	 */
>> +	ts.tv_sec = 0;
>> +	ts.tv_nsec = 10 * 1000;
>> +	nanosleep_nopoll(&ts, NULL);
>>  	return;
>>   fail:
>>  	/* Fatal error resetting XSCOM */
>> @@ -140,7 +154,6 @@ static void xscom_reset(uint32_t gcid)
>>  static int64_t xscom_handle_error(uint64_t hmer, uint32_t gcid, uint32_t pcb_addr,
>>  			      bool is_write, int64_t retries)
>>  {
>> -	struct timespec ts;
>>  	unsigned int stat = GETFIELD(SPR_HMER_XSCOM_STATUS, hmer);
>>  	int64_t rc = OPAL_HARDWARE;
>>  
>> @@ -160,18 +173,6 @@ static int64_t xscom_handle_error(uint64_t hmer, uint32_t gcid, uint32_t pcb_add
>>  				XSCOM_BUSY_RESET_THRESHOLD, retries);
>>  			xscom_reset(gcid);
>>  
>> -			/*
>> -			 * Its observed that sometimes immediate retry of
>> -			 * XSCOM operation returns wrong data. Adding a
>> -			 * delay for XSCOM reset to be effective. Delay of
>> -			 * 10 ms is found to be working fine experimentally.
>> -			 * FIXME: Replace 10ms delay by exact delay needed
>> -			 * or other alternate method to confirm XSCOM reset
>> -			 * completion, after checking from HW folks.
>> -			 */
>> -			ts.tv_sec = 0;
>> -			ts.tv_nsec = 10 * 1000;
>> -			nanosleep_nopoll(&ts, NULL);
>>  		}
>>  
>>  		/* Log error if we have retried enough and its still busy */
>>
>> _______________________________________________
>> Skiboot mailing list
>> Skiboot at lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/skiboot
> 



More information about the Skiboot mailing list