[Skiboot] [PATCH v2] hw/xscom: Reset XSCOM after finite number of retries when busy

Vasant Hegde hegdevasant at linux.vnet.ibm.com
Wed May 18 20:07:55 AEST 2016


On 05/16/2016 09:35 PM, Vipin K Parashar wrote:
> OPAL retries XSCOM read/write operations forever until they succeed.
> This can cause XSCOM ops to hang if XSCOM remains busy for some reason.
> Change it to retry XSCOM operations XSCOM_BUSY_MAX_RETRIES times.
> Also add logic to reset XSCOM after XSCOM_BUSY_RESET_THRESHOLD
> retries to unblock it if it's still busy.
>
> Signed-off-by: Vipin K Parashar <vipin at linux.vnet.ibm.com>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
> ---

.../...

> @@ -127,9 +131,26 @@ static int xscom_handle_error(uint64_t hmer, uint32_t gcid, uint32_t pcb_addr,
>   	 * recovery procedures
>   	 */
>   	switch(stat) {
> -	/* XSCOM blocked, just retry */
> +	/*
> +	 * XSCOM blocked, need to retry. Reset XSCOM after
> +	 * crossing retry threshold before retrying again.
> +	 */
>   	case 1:
> +		if (retries && !(retries  % XSCOM_BUSY_RESET_THRESHOLD)) {
> +			prlog(PR_NOTICE, "XSCOM: Busy!! Resetting after %d "
> +				"retries, Total retries  = %lld\n",
> +				XSCOM_BUSY_RESET_THRESHOLD, retries);
> +			xscom_reset(gcid);
> +		}
> +
> +		/* Log error if we have retried enough and its still busy */
> +		if (retries == XSCOM_BUSY_MAX_RETRIES)
> +			log_simple_error(&e_info(OPAL_RC_XSCOM_BUSY),
> +				"XSCOM: %s-busy error gcid=0x%x pcb_addr=0x%x "
> +				"stat=0x%x\n", is_write ? "write" : "read",
> +				gcid, pcb_addr, stat);
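
For reference, here is a minimal standalone sketch of how I read the
busy-retry policy in this hunk. It is only an illustration under stated
assumptions: the two macro values are made up (the real ones are defined
elsewhere in the patch), the surrounding retry loop is my guess since the
caller is not quoted here, and xscom_busy_sim() / struct busy_stats are
hypothetical names that only count what the real code would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only, not taken from the patch. */
#define XSCOM_BUSY_MAX_RETRIES		3000
#define XSCOM_BUSY_RESET_THRESHOLD	1000

/* Hypothetical bookkeeping: what the real code would have done. */
struct busy_stats {
	uint64_t resets;	/* times xscom_reset() would be called */
	bool logged;		/* busy error would have been logged */
	bool ok;		/* operation completed before giving up */
};

/*
 * Simulate an XSCOM that reports "blocked" (stat == 1) for the first
 * busy_polls attempts. The loop shape is an assumption; only the two
 * conditions inside it mirror the quoted "case 1" code.
 */
static struct busy_stats xscom_busy_sim(uint64_t busy_polls)
{
	struct busy_stats st = { 0, false, false };
	uint64_t retries;

	/* Guard the modulo below against a zero threshold. */
	assert(XSCOM_BUSY_RESET_THRESHOLD > 0);

	for (retries = 0; retries <= XSCOM_BUSY_MAX_RETRIES; retries++) {
		if (retries >= busy_polls) {
			st.ok = true;	/* no longer busy, op succeeds */
			break;
		}

		/* Reset at every non-zero multiple of the threshold. */
		if (retries && !(retries % XSCOM_BUSY_RESET_THRESHOLD))
			st.resets++;

		/* Log once, when the retry budget is exhausted. */
		if (retries == XSCOM_BUSY_MAX_RETRIES)
			st.logged = true;
	}

	return st;
}
```

With these assumed values, an XSCOM that clears after 1500 busy polls
sees one reset and no error log, while one that never clears sees
resets at 1000, 2000 and 3000 retries and then the busy error.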

We are trying to update the SRC documentation.

What should the repair action be here? Do you expect the customer to replace
parts, or something else?

-Vasant


