phosphor-ipmi-flash state recovery
Patrick_Voelker at phoenix.com
Sat May 23 08:57:48 AEST 2020
Do you have a timeline for upstreaming your changes?
10 minutes seems rather excessive for a session timeout. I’d prefer to see it set to something more like 30 seconds except for the update process itself if it involves writing to flash.
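To make the timeout idea concrete, here is a rough sketch of idle-session expiry. Everything in it (the `ExpiringSessions` class, `touch`, `expire`) is hypothetical and not taken from phosphor-ipmi-blobs or phosphor-ipmi-flash; it only assumes that sessions are keyed by a 16-bit ID and that the daemon can call `expire()` periodically.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical session table with idle expiry (illustration only).
class ExpiringSessions
{
  public:
    explicit ExpiringSessions(Clock::duration timeout) : timeout_(timeout) {}

    // Record activity on a session (open, write, stat, ...).
    void touch(std::uint16_t id, Clock::time_point now)
    {
        lastSeen_[id] = now;
    }

    // Drop every session idle for longer than the timeout and return the
    // dropped IDs so the caller can run per-session cleanup.
    std::vector<std::uint16_t> expire(Clock::time_point now)
    {
        std::vector<std::uint16_t> expired;
        for (auto it = lastSeen_.begin(); it != lastSeen_.end();)
        {
            if (now - it->second > timeout_)
            {
                expired.push_back(it->first);
                it = lastSeen_.erase(it);
            }
            else
            {
                ++it;
            }
        }
        return expired;
    }

    // True while the session is still tracked.
    bool active(std::uint16_t id) const
    {
        return lastSeen_.count(id) > 0;
    }

  private:
    Clock::duration timeout_;
    std::map<std::uint16_t, Clock::time_point> lastSeen_;
};
```

A table built with a 30 second timeout would cover the normal case; the flash-write phase could either keep calling `touch()` or use a second table with a longer timeout.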
Alternately, what would you think of this option?
· Add a ‘force’ command line override to burn_my_bmc that would cause it to attempt to obtain the current SessionID and do a close on it if the specified image is already active.
· On the BMC side, I’d need a method to obtain the session ID for a blob. It would have been perfect to return that in BmcBlobStat’s response, but maybe a new command called BmcBlobSession would be appropriate?
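A minimal sketch of how the two pieces could fit together, purely as an assumption: `ToyBlobManager` stands in for the BMC-side blob manager, `sessionFor()` plays the role of the proposed BmcBlobSession command, and `forceCloseStale()` is what the ‘force’ override in burn_my_bmc might do. None of these names exist in the current code.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for the BMC-side blob manager (illustration only).
class ToyBlobManager
{
  public:
    // Open a blob and return a new session ID.
    std::uint16_t open(const std::string& path)
    {
        std::uint16_t id = next_++;
        sessions_[id] = path;
        return id;
    }

    // What a proposed "BmcBlobSession" command might return: the session
    // currently holding the blob, if any.
    std::optional<std::uint16_t> sessionFor(const std::string& path) const
    {
        for (const auto& [id, held] : sessions_)
        {
            if (held == path)
            {
                return id;
            }
        }
        return std::nullopt;
    }

    // Close a session; returns false if the ID is unknown.
    bool close(std::uint16_t id)
    {
        return sessions_.erase(id) > 0;
    }

  private:
    std::uint16_t next_ = 1;
    std::map<std::uint16_t, std::string> sessions_;
};

// Host-side "force" flow: look up the session holding the blob and close it.
// Returns true if a stale session was found and closed.
bool forceCloseStale(ToyBlobManager& mgr, const std::string& path)
{
    auto stale = mgr.sessionFor(path);
    return stale.has_value() && mgr.close(*stale);
}
```

In the real handler, closing the session is what would need to reach abortProcess(); the toy manager only drops the bookkeeping entry.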
From: William Kennington [mailto:wak at google.com]
Sent: Friday, May 22, 2020 3:15 PM
To: Patrick Voelker
Cc: OpenBMC (openbmc at lists.ozlabs.org); Patrick Venture; Benjamin Fair
Subject: Re: phosphor-ipmi-flash state recovery
I was working on a change to fix this a couple of weeks ago and implement the needed expiry / cancellation mechanisms to make everything happy. Right now the easiest thing you can do is reset the BMC or just the ipmi daemon.
On Fri, May 22, 2020 at 3:06 PM Patrick Voelker <Patrick_Voelker at phoenix.com> wrote:
When running burn_my_bmc, if I exit the program during image upload with Ctrl-C, it seems that the BMC gets left in a state that is difficult to recover from.
When attempting to run the update again I can see that the /flash/active/image blob is present. burn_my_bmc opens the cleanup blob, commits it, and then closes it but the state doesn’t change. I don’t have the cleanup-delete option enabled but it doesn’t look like that cleans up the state anyhow.
Internally, it looks like I need to get to abortProcess(). To do that I need to close the current session, but I don’t have a way to obtain the session ID after the fact. Also, the stale session doesn’t seem to expire (as mentioned in the readme.md), and I can’t find the support for that in the code.
Can you give me a pointer on the best known way to recover from this scenario without rebooting the BMC?