[Skiboot] [PATCH v2 2/2] mbox: Reduce default BMC timeouts
Nicholas Piggin
npiggin at gmail.com
Fri Mar 23 11:42:40 AEDT 2018
On Fri, 23 Mar 2018 10:56:52 +1100
Cyril Bur <cyril.bur at au1.ibm.com> wrote:
> Rebooting a BMC can take 70 seconds. Skiboot cannot possibly spin for
> 70 seconds waiting for a BMC to come back. This also makes the current
> default of 30 seconds a bit pointless, is it far too short to be a
> worse case wait time but too long to avoid hitting hardlockup detectors
> and wrecking havoc inside host linux.
>
> Just change it to three seconds so that host linux will survive and
> that, reads and writes will fail but at least the host stays up.
>
> Also refactored the waiting loop just a bit so that it's easier to read.
>
> Reported-by: Pridhiviraj Paidipeddi <ppaidipe at linux.vnet.ibm.com>
> Tested-by: Pridhiviraj Paidipeddi <ppaidipe at linux.vnet.ibm.com>
> Signed-off-by: Cyril Bur <cyril.bur at au1.ibm.com>
Thanks for splitting it out. There's one open-coded wait_for_bmc
timeout now that may be a bit out of date after this change to
defaults -- should that just be changed to use mbox_flash->timeout
too now?
Reviewed-by: Nicholas Piggin <npiggin at gmail.com>
> ---
> v2: Split this patch out from the other one and lowered the default
> timeout.
>
> libflash/mbox-flash.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/libflash/mbox-flash.c b/libflash/mbox-flash.c
> index 7bff34c4..dd1f11c2 100644
> --- a/libflash/mbox-flash.c
> +++ b/libflash/mbox-flash.c
> @@ -42,7 +42,7 @@
> /* Same technique as BUILD_BUG_ON from linux */
> #define CHECK_HANDLER_SIZE(handlers) ((void)sizeof(char[1 - 2*!!(ARRAY_SIZE(handlers) != (MBOX_COMMAND_COUNT + 1))]))
>
> -#define MBOX_DEFAULT_TIMEOUT 30
> +#define MBOX_DEFAULT_TIMEOUT 3 /* seconds */
>
> #define MSG_CREATE(init_command) { .command = init_command }
>
> @@ -334,15 +334,14 @@ static int wait_for_bmc(struct mbox_flash_data *mbox_flash, unsigned int timeout
> {
> unsigned long last = 1, start = tb_to_secs(mftb());
> prlog(PR_TRACE, "Waiting for BMC\n");
> - while (mbox_flash->busy && timeout_sec) {
> + while (mbox_flash->busy && timeout_sec > last) {
> long now = tb_to_secs(mftb());
> if (now - start > last) {
> - timeout_sec--;
> - last = now - start;
> if (last < timeout_sec / 2)
> prlog(PR_TRACE, "Been waiting for the BMC for %lu secs\n", last);
> else
> prlog(PR_ERR, "BMC NOT RESPONDING %lu second wait\n", last);
> + last++;
> }
> /*
> * Both functions are important.
More information about the Skiboot
mailing list