[PATCH linux dev-4.10 v2] drivers (fsi): sbefifo: Add timeout for waiting for data

Andrew Jeffery andrew at aj.id.au
Mon Oct 30 16:26:11 AEDT 2017


On Thu, 2017-10-26 at 17:34 -0500, Eddie James wrote:
> From: "Edward A. James" <eajames at us.ibm.com>
> 
> During a transfer, in the event that the SBE isn't running, the driver
> will wait for data to appear in the FIFO forever. This doesn't work at
> BMC boot time if any transfers are attempted (by OCC driver probing, for
> example), as it will just hang the boot. So add a simple timeout
> mechanism when the driver reschdules waiting for data to show up in the
> FIFO.
> 
> Tested on witherspoon by rebooting the BMC successfully with chassis
> power on.

So the thing is the BMC *does* successfully boot *without* this patch. It just
takes an *insanely* long time at around 870 seconds before we hand over to
init.

So it seems there's an existing timeout mechanism in place, but this patch adds
a new one on top. Your commit message implies to me that the BMC should never
complete the boot process under these conditions, but given it does, can you
outline (in an reply and in the commit message) why this new timeout is the
right solution?

I tried to understand the behaviour myself but did not find an obvious
justification for 870 seconds.

Cheers,

Andrew

> 
> Signed-off-by: Edward A. James <eajames at us.ibm.com>
> ---
> Changes since v1:
>  * Reset the wait_data_timeout field if we get data. If we get data, the SBE
>    is running and we really shouldn't timeout.
> 
>  drivers/fsi/fsi-sbefifo.c | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
> 
> diff --git a/drivers/fsi/fsi-sbefifo.c b/drivers/fsi/fsi-sbefifo.c
> index f756822..41b838e 100644
> --- a/drivers/fsi/fsi-sbefifo.c
> +++ b/drivers/fsi/fsi-sbefifo.c
> @@ -54,6 +54,7 @@
>  #define SBEFIFO_EOT_ACK		0x14
>  
>  #define SBEFIFO_RESCHEDULE	msecs_to_jiffies(500)
> +#define SBEFIFO_MAX_RESCHDULE	msecs_to_jiffies(5000)
>  
>  struct sbefifo {
>  	struct timer_list poll_timer;
> @@ -77,6 +78,7 @@ struct sbefifo_buf {
>  };
>  
>  struct sbefifo_xfr {
> +	unsigned long wait_data_timeout;
>  	struct sbefifo_buf *rbuf;
>  	struct sbefifo_buf *wbuf;
>  	struct list_head client;
> @@ -450,11 +452,27 @@ static void sbefifo_poll_timer(unsigned long data)
>  
>  		devn = sbefifo_dev_nwreadable(sts);
>  		if (devn == 0) {
> +			/*
> +			 * Limit the maximum waiting period for data in the
> +			 * FIFO. If the SBE isn't running, we will wait
> +			 * forever.
> +			 */
> +			if (!xfr->wait_data_timeout) {
> +				xfr->wait_data_timeout =
> +					jiffies + SBEFIFO_MAX_RESCHDULE;
> +			} else if (time_after(jiffies,
> +					      xfr->wait_data_timeout)) {
> +				ret = -ETIME;
> +				goto out;
> +			}
> +
>  			/* No data yet.  Reschedule. */
>  			sbefifo->poll_timer.expires = jiffies +
>  				SBEFIFO_RESCHEDULE;
>  			add_timer(&sbefifo->poll_timer);
>  			goto out_unlock;
> +		} else {
> +			xfr->wait_data_timeout = 0;
>  		}
>  
>  		/* Fill.  The EOT word is discarded.  */
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 801 bytes
Desc: This is a digitally signed message part
URL: <http://lists.ozlabs.org/pipermail/openbmc/attachments/20171030/ef23ff5e/attachment-0001.sig>


More information about the openbmc mailing list