[Skiboot] [PATCH] IPMI: Trigger attention in abort path.

Oliver O'Halloran oohall at gmail.com
Thu Oct 24 15:48:38 AEDT 2019


On Thu, Oct 24, 2019 at 3:21 AM Mahesh Salgaonkar
<mahesh at linux.vnet.ibm.com> wrote:
>
> OpenBMC can catch the attn instruction as a TI and help reboot (IPL) the
> host while keeping a reboot counter. Other BMCs, e.g. SMC and AMI, lack
> this functionality, so OPAL has never triggered an attn in the
> abort/assert path on BMC-based systems. Instead, it has always triggered a
> normal reboot on abort, which means the BMC is never notified about an
> OPAL termination/reboot. This sometimes leads to a never-ending IPL loop
> if OPAL keeps aborting very early in the boot path. This can be avoided
> on OpenBMC systems that support handling of TI (attn instruction).

Which versions of OpenBMC support this feature? I'm going to assume that
it's not all of them.

> With the AutoReboot policy, OpenBMC handles TIs (attn instruction) and
> counts them against the reboot counter. In cases where OPAL crashes before
> the host reaches runtime, OpenBMC will move the system into the Quiesced
> state after 3 or so IPL/reboot attempts so that the system can be debugged.
>
> This patch triggers an attn on OpenBMC systems to inform the BMC about the
> OPAL termination. When the BMC moves the system to the Quiesced state, it
> does not make sure that all CPU threads are also quiesced. Hence, make sure
> to move all secondaries into the quiesced state before calling attn.
>
> Signed-off-by: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
> ---
>  hw/ipmi/ipmi-attn.c       |   32 ++++++++++++++++++++++++++------
>  include/platform.h        |    1 +
>  platforms/astbmc/common.c |    1 +
>  3 files changed, 28 insertions(+), 6 deletions(-)
>
> diff --git a/hw/ipmi/ipmi-attn.c b/hw/ipmi/ipmi-attn.c
> index 3a615189d..4b5e9ca89 100644
> --- a/hw/ipmi/ipmi-attn.c
> +++ b/hw/ipmi/ipmi-attn.c
> @@ -14,6 +14,7 @@
>  #include <skiboot.h>
>  #include <stack.h>
>  #include <timebase.h>
> +#include <direct-controls.h>
>
>  /* Use same attention SRC for BMC based machine */
>  DEFINE_LOG_ENTRY(OPAL_RC_ATTN, OPAL_PLATFORM_ERR_EVT,
> @@ -67,18 +68,37 @@ void __attribute__((noreturn)) ipmi_terminate(const char *msg)
>          */
>         p9_sbe_terminate();
>
> -       /* Terminate called before initializing IPMI (early abort) */
> -       if (!ipmi_present()) {
> -               if (platform.cec_reboot)
> -                       platform.cec_reboot();
> -               goto out;
> +       /*
> +        * if BMC has attn support then let BMC know that we are terminating by
> +        * triggering attn so that BMC will decide whether to reboot/IPL or not
> +        * depending on AutoReboot policy.  This helps in cases where OPAL is
> +        * crashing/terminating before host reaches to runtime. With BMC
> +        * AutoReboot policy, in such cases, it will make sure that system is
> +        * moved to Quiesced state after 3 or so attempts to IPL. Without
> +        * `attn` call BMC will never know that OPAL is terminating and system
> +        * would go into never ending IPL'ing loop.
> +        *
> +        * When BMC moves the system into Quiesced state, it does not make sure
> +        * that all CPU threads are also quiesced. Hence, make sure to move all
> +        * secondaries into quiesced state before calling attn.
> +        *
> +        * Once the system reaches to runtime BMC resets the boot counter.
> +        * Hence next time when BMC receives the attn it will IPL the system
> +        * if AutoReboot is enabled. We don't need to worry about self
> +        * rebooting
> +        */
> +
> +       if (platform.bmc->sw->attn_supported) {
> +               /* Put everybody in stop/quiesce except myself. */
> +               sreset_all_prepare();
> +               trigger_attn();
> +               for (;;) ;

Make it an msleep-style wait (e.g. time_wait_ms()) for a second or two
and then fall back to platform.cec_reboot(). We shouldn't assume the BMC
will handle the attn.

It might be a good idea to quiesce the secondaries regardless. Are
there any drawbacks to that?
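
To make the suggestion concrete, here is a rough, untested sketch of the
control flow I have in mind. It is not skiboot code: the functions below are
local stand-ins with simplified signatures so the fragment compiles on its
own, ATTN_GRACE_MS and terminate_sketch() are hypothetical names, and the
two-second grace period is only a guess. It also assumes quiescing the
secondaries unconditionally turns out to be safe, which is the open question
above. Unlike the real ipmi_terminate(), the sketch returns so the ordering
can be checked.

```c
#include <stdbool.h>

/* Counters that record what the stand-ins were asked to do. */
static int quiesce_calls, attn_calls, reboot_calls;
static unsigned long waited_ms;

/* Stand-ins for the skiboot functions named in the patch (signatures simplified). */
static void sreset_all_prepare(void) { quiesce_calls++; }
static void trigger_attn(void) { attn_calls++; }
static void time_wait_ms(unsigned long ms) { waited_ms += ms; }
static void cec_reboot(void) { reboot_calls++; }

/* Hypothetical grace period given to the BMC to act on the attn. */
#define ATTN_GRACE_MS 2000

static void terminate_sketch(bool attn_supported)
{
	/* Assumption: quiesce the secondaries whether or not attn is supported. */
	sreset_all_prepare();

	if (attn_supported) {
		/* Tell the BMC we are terminating, then give it a bounded wait. */
		trigger_attn();
		time_wait_ms(ATTN_GRACE_MS);
	}

	/* If the BMC did not take over, fall back to a normal reboot. */
	cec_reboot();
}
```

The point is that the attn path becomes a bounded wait rather than an
infinite loop, so a BMC that ignores the attn still ends up in the existing
reboot path.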

>         }
>
>         /* Reboot call */
>         if (platform.cec_reboot)
>                 platform.cec_reboot();
>
> -out:
>         while (1)
>                 time_wait_ms(100);
>  }
> diff --git a/include/platform.h b/include/platform.h
> index 0b043856b..1fdc07fb8 100644
> --- a/include/platform.h
> +++ b/include/platform.h
> @@ -39,6 +39,7 @@ struct bmc_sw_config {
>         uint32_t ipmi_oem_partial_add_esel;
>         uint32_t ipmi_oem_pnor_access_status;
>         uint32_t ipmi_oem_hiomap_cmd;
> +       uint32_t attn_supported;
>  };
>
>  struct bmc_platform {
> diff --git a/platforms/astbmc/common.c b/platforms/astbmc/common.c
> index 15ac231fb..0a1cc86b0 100644
> --- a/platforms/astbmc/common.c
> +++ b/platforms/astbmc/common.c
> @@ -512,6 +512,7 @@ const struct bmc_sw_config bmc_sw_ami = {
>  const struct bmc_sw_config bmc_sw_openbmc = {
>         .ipmi_oem_partial_add_esel   = IPMI_CODE(0x3a, 0xf0),
>         .ipmi_oem_hiomap_cmd         = IPMI_CODE(0x3a, 0x5a),
> +       .attn_supported              = 1,
>  };
>
>  /* Extracted from a Palmetto */
>