[Skiboot] [PATCH] IPMI: Trigger attention in abort path.

Vasant Hegde hegdevasant at linux.vnet.ibm.com
Thu Oct 24 03:46:10 AEDT 2019


On 10/23/19 9:50 PM, Mahesh Salgaonkar wrote:
> OpenBMC is capable of catching attn instruction as TI and facilitate in
> rebooting (IPL-ing) host while keeping a reboot counter. This functionality
> was not present with other BMCs e.g. SMC and AMI. And hence OPAL never used
> to trigger an attn during abort/assert path for BMC based system. Instead
> it always triggered normal reboot during abort. This means that BMC never
> gets notified about OPAL termination/reboot. This sometimes leads into
> never ending IPL-ing loop if OPAL keeps aborting very early in boot path.
> This can be avoided on OpenBMC system that supports handling of TI (attn
> instruction).
> 
> With AutoReboot policy, OpenBMC handles TIs (attn instruction) and counts
> them against the reboot counter. In cases where OPAL is crashing before
> host reaches to runtime, OpenBMC will move the system in Quiesced state
> after 3 or so attempts of IPL/reboot so that system can be debugged.
> 
> This patch triggers an attn on OpenBMC system to inform BMC about the OPAL
> termination. When system is moved to Quiesced state by BMC, it does not
> makes sure that all CPU threads are also quiesced. Hence, make sure to
> move all secondaries into quiesced state before calling attn.
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh at linux.vnet.ibm.com>
> ---
>   hw/ipmi/ipmi-attn.c       |   32 ++++++++++++++++++++++++++------
>   include/platform.h        |    1 +
>   platforms/astbmc/common.c |    1 +
>   3 files changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/ipmi/ipmi-attn.c b/hw/ipmi/ipmi-attn.c
> index 3a615189d..4b5e9ca89 100644
> --- a/hw/ipmi/ipmi-attn.c
> +++ b/hw/ipmi/ipmi-attn.c
> @@ -14,6 +14,7 @@
>   #include <skiboot.h>
>   #include <stack.h>
>   #include <timebase.h>
> +#include <direct-controls.h>
> 
>   /* Use same attention SRC for BMC based machine */
>   DEFINE_LOG_ENTRY(OPAL_RC_ATTN, OPAL_PLATFORM_ERR_EVT,
> @@ -67,18 +68,37 @@ void __attribute__((noreturn)) ipmi_terminate(const char *msg)
>   	 */
>   	p9_sbe_terminate();
> 
> -	/* Terminate called before initializing IPMI (early abort) */
> -	if (!ipmi_present()) {
> -		if (platform.cec_reboot)
> -			platform.cec_reboot();
> -		goto out;
> +	/*
> +	 * if BMC has attn support then let BMC know that we are terminating by
> +	 * triggering attn so that BMC will decide whether to reboot/IPL or not
> +	 * depending on AutoReboot policy.  This helps in cases where OPAL is
> +	 * crashing/terminating before host reaches to runtime. With BMC
> +	 * AutoReboot policy, in such cases, it will make sure that system is
> +	 * moved to Quiesced state after 3 or so attempts to IPL. Without
> +	 * `attn` call BMC will never know that OPAL is terminating and system
> +	 * would go into never ending IPL'ing loop.
> +	 *
> +	 * When BMC moves the system into Quiesced state, it does not make sure
> +	 * that all CPU threads are also quiesced. Hence, make sure to move all
> +	 * secondaries into quiesced state before calling attn.
> +	 *
> +	 * Once the system reaches to runtime BMC resets the boot counter.
> +	 * Hence next time when BMC receieves the attn it will IPL the system
> +	 * if AutoReboot is enabled. We don't need to worry about self
> +	 * rebooting
> +	 */
> +
> +	if (platform.bmc->sw->attn_supported) {
> +		/* Put everybody in stop/quiesce except myself. */
> +		sreset_all_prepare();

I think we should validate return value from sreset_all_prepare(). If we fails 
to quiesce the CPU then
its better to reboot.

> +		trigger_attn();
> +		for (;;) ;
>   	}
> 
>   	/* Reboot call */
>   	if (platform.cec_reboot)
>   		platform.cec_reboot();
> 
> -out:
>   	while (1)
>   		time_wait_ms(100);
>   }
> diff --git a/include/platform.h b/include/platform.h
> index 0b043856b..1fdc07fb8 100644
> --- a/include/platform.h
> +++ b/include/platform.h
> @@ -39,6 +39,7 @@ struct bmc_sw_config {
>   	uint32_t ipmi_oem_partial_add_esel;
>   	uint32_t ipmi_oem_pnor_access_status;
>   	uint32_t ipmi_oem_hiomap_cmd;
> +	uint32_t attn_supported;

Better bool ?

-Vasant



More information about the Skiboot mailing list