[Skiboot] [PATCH 13/51] libflash/ipmi-hiomap: Don't remove acked events unless we recover

Andrew Jeffery andrew at aj.id.au
Mon Feb 18 13:35:30 AEDT 2019


On Mon, 18 Feb 2019, at 11:38, Stewart Smith wrote:
> "Andrew Jeffery" <andrew at aj.id.au> writes:
> > On Fri, 15 Feb 2019, at 17:28, Andrew Jeffery wrote:
> >> Whilst we send the acks to the BMC, if they are removed before the
> >> recovery is complete we may wedge the host if the error was transient.
> >> There's no harm in acking already-acked events, and we'll need to send
> >> an ack anyway if another ack-able event has occurred in the meantime.
> >> 
> >> Signed-off-by: Andrew Jeffery <andrew at aj.id.au>
> >> ---
> >>  libflash/ipmi-hiomap.c | 9 ++++++++-
> >>  1 file changed, 8 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/libflash/ipmi-hiomap.c b/libflash/ipmi-hiomap.c
> >> index 86e47396efba..97d2a602af96 100644
> >> --- a/libflash/ipmi-hiomap.c
> >> +++ b/libflash/ipmi-hiomap.c
> >> @@ -637,7 +637,6 @@ static int ipmi_hiomap_handle_events(struct 
> >> ipmi_hiomap *ctx)
> >>  
> >>  	lock(&ctx->lock);
> >>  	status = ctx->bmc_state;
> >> -	ctx->bmc_state &= ~HIOMAP_E_ACK_MASK;
> >>  	unlock(&ctx->lock);
> >>  
> >>  	if (!(status & HIOMAP_E_DAEMON_READY)) {
> >> @@ -681,6 +680,10 @@ static int ipmi_hiomap_handle_events(struct 
> >> ipmi_hiomap *ctx)
> >>  			return rc;
> >>  		}
> >>  
> >> +		lock(&ctx->lock);
> >> +		ctx->bmc_state &= ~HIOMAP_E_PROTOCOL_RESET;
> >> +		unlock(&ctx->lock);
> >> +
> >>  		prlog(PR_INFO, "Restored window state after protocol reset\n");
> >>  		return 0;
> >>  	}
> >> @@ -694,6 +697,10 @@ static int ipmi_hiomap_handle_events(struct 
> >> ipmi_hiomap *ctx)
> >>  			return rc;
> >>  		}
> >>  
> >> +		lock(&ctx->lock);
> >> +		ctx->bmc_state &= ~HIOMAP_E_WINDOW_RESET;
> >> +		unlock(&ctx->lock);
> >> +
> >>  		prlog(PR_INFO, "Restored window state after window reset\n");
> >>  	}
> >>  
> >
> > I came up with a test case that I think will break this change. If we get a reset after recovering the window but before the clear the new reset will be lost. It's not a problem up to and including the window recovery because we'll get an error response and bail out before the clear.
> >
> > I'll fix this in a v2 after any other comments have been made.
> 
> Cool, I'll wait for v2, but should this patch also go to stable?

As discussed out of band, the whole series probably should. I'll
Cc stable for all patches in v2.

> 
> -- 
> Stewart Smith
> OPAL Architect, IBM.
> 


More information about the Skiboot mailing list