[Skiboot] [PATCH v3] hw/xscom: Reset XSCOM engine after finite number of retries when busy

Vaidyanathan Srinivasan svaidy at linux.vnet.ibm.com
Fri Jun 3 15:59:49 AEST 2016


* Mahesh J Salgaonkar <mahesh at linux.vnet.ibm.com> [2016-06-03 11:16:30]:

> On 2016-05-31 17:09:01 Tue, Vipin K Parashar wrote:

[snip]

> I understand that today we are not sure about how to verify whether xscom reset
> has succeeded or not and hence this patch adds the delay. But I think we
> should add FIXME comment here so that in future we can fix this
> properly by talking to hardware folks to find out alternate way to
> verify xscom reset completion status.

Yes, we could add FIXME in comment until we can figure out the exact
cycle checks from hardware team. But this is a very very rare recovery
case/scenario and should not hold-up OPAL under normal operation.
 
> > +			ts.tv_sec = 0;
> > +			ts.tv_nsec = 10 * 1000;
> > +			nanosleep_nopoll(&ts, NULL);
> > +		}
> > +
> > +		/* Log error if we have retried enough and its still busy */
> > +		if (retries == XSCOM_BUSY_MAX_RETRIES)
> > +			log_simple_error(&e_info(OPAL_RC_XSCOM_BUSY),
> > +				"XSCOM: %s-busy error gcid=0x%x pcb_addr=0x%x "
> > +				"stat=0x%x\n", is_write ? "write" : "read",
> > +				gcid, pcb_addr, stat);
> >  		return OPAL_BUSY;
> > +
> >  	/* CPU is asleep, don't retry */
> >  	case 2:
> >  		return OPAL_WRONG_STATE;
> > @@ -177,15 +210,16 @@ static bool xscom_gcid_ok(uint32_t gcid)
> >   */
> >  static int __xscom_read(uint32_t gcid, uint32_t pcb_addr, uint64_t *val)
> >  {
> > +	int i;
> >  	uint64_t hmer;
> > -	int64_t ret;
> > +	int64_t ret, retries = 0;
> >  
> >  	if (!xscom_gcid_ok(gcid)) {
> >  		prerror("%s: invalid XSCOM gcid 0x%x\n", __func__, gcid);
> >  		return OPAL_PARAMETER;
> >  	}
> >  
> > -	for (;;) {
> > +	for (i = 0; i <= XSCOM_BUSY_MAX_RETRIES; i++) {

        for (retries = 0; retries <= XSCOM_BUSY_MAX_RETRIES; retries++)

> >  		/* Clear status bits in HMER (HMER is special
> >  		 * writing to it *ands* bits
> >  		 */
> > @@ -199,27 +233,31 @@ static int __xscom_read(uint32_t gcid, uint32_t pcb_addr, uint64_t *val)
> >  
> >  		/* Check for error */
> >  		if (!(hmer & SPR_HMER_XSCOM_FAIL))
> > -			break;
> > +			return OPAL_SUCCESS;
> >  
> >  		/* Handle error and possibly eventually retry */
> > -		ret = xscom_handle_error(hmer, gcid, pcb_addr, false);
> > -		if (ret == OPAL_HARDWARE || ret == OPAL_WRONG_STATE)
> > -			return ret;
> > +		ret = xscom_handle_error(hmer, gcid, pcb_addr, false, retries);
> > +		if (ret != OPAL_BUSY)
> > +			break;
> > +		retries++;
> 
> I think we can get rid off extra variable here 'retries'. Just pass
> variable 'i' to xscom_handle_error() which anyway gets incremented as
> part of for loop.

Good idea. How about keep the name retries, and remove i :)

--Vaidy



More information about the Skiboot mailing list