[Skiboot] [PATCH] [v3] Fast reboot for P8

Stewart Smith stewart at linux.vnet.ibm.com
Tue Sep 27 18:00:05 AEST 2016


Benjamin Herrenschmidt <benh at kernel.crashing.org> writes:
> This is an experimental patch that implements "Fast reboot" on P8
> machines.
>
> The basic idea is that when the OS calls OPAL reboot, we gather all
> the threads in the system using a combination of patching the reset
> vector and soft-resetting them, then cleanup a few bits of hardware
> (we do re-probe PCIe for example), and reload & restart the bootloader.
>
> This is very experimental and needs a lot of testing and also auditing
> code for other bits of HW that might need to be cleaned up. I also need
> to check if we are properly PERST'ing PCI devices.
>
> I've successfully fast rebooted a Habanero a few times.
>
> This is partially based on old code I had to do that on P7. I only
> support it on P8 though as there are issues with the PSI interrupts
> on P7 that cannot be reliably solved.
>
> Not-yet-signed-off-by: Benjamin Herrenschmidt
> <benh at kernel.crashing.org>

What would be your criteria for going to Signed-off-by?

I'm thinking:
1) behind a config switch (I'm thinking
   nvram_query("experimental-fast-reboot")=="Danger-is-my-middle-name" or some
   such check)
2) CAPI disable patches from Andrew
3) Check for checkstops

Although skipping 2 and 3 could be a way forward, since with (1) in
place it would not be enabled by default.
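
To make (1) a bit more concrete, here is a rough sketch of such a gate.
The nvram_query() below is a stand-in backed by a static table, not
skiboot's real implementation, and fast_reboot_enabled() is a made-up
name; the key and value strings are just the ones floated above. The
"==" in (1) is shorthand, a C version would compare with strcmp().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in key/value store; a real build would read this from NVRAM. */
struct nvram_kv {
	const char *key;
	const char *value;
};

static const struct nvram_kv fake_nvram[] = {
	{ "experimental-fast-reboot", "Danger-is-my-middle-name" },
};

/* Hypothetical lookup: returns the value for key, or NULL if unset. */
const char *nvram_query(const char *key)
{
	size_t i;

	assert(key != NULL);
	for (i = 0; i < sizeof(fake_nvram) / sizeof(fake_nvram[0]); i++)
		if (strcmp(fake_nvram[i].key, key) == 0)
			return fake_nvram[i].value;
	return NULL;
}

/* Fast reboot stays off unless the key holds exactly the opt-in value. */
bool fast_reboot_enabled(void)
{
	const char *v = nvram_query("experimental-fast-reboot");

	return v != NULL && strcmp(v, "Danger-is-my-middle-name") == 0;
}
```

With a gate like this, an unset key or any other value falls back to a
normal reboot.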

A couple of quick thoughts below, nowhere near a complete review of course :)

> --- a/hw/fsp/fsp-leds.c
> +++ b/hw/fsp/fsp-leds.c
> @@ -1570,6 +1570,9 @@ void create_led_device_nodes(void)
>  	if (!pled)
>  		return;
>  
> +	/* Check if already populated (fast-reboot) */
> +	if (dt_has_node_property(pled, "compatible", NULL))
> +		return;

Any thoughts on a consistent way to do this that makes it relatively
obvious we're in fast reboot rather than IPL?
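
One shape this could take, as a sketch only: a single flag set on the
fast reboot path, which init code queries instead of each site inferring
it from leftover state. The names fast_reboot_in_progress,
fast_reboot_mark(), is_fast_reboot() and create_led_nodes_example() are
invented for illustration and are not existing skiboot symbols.

```c
#include <stdbool.h>

/* Hypothetical flag: false on a normal IPL, true once a fast reboot starts. */
static bool fast_reboot_in_progress;

/* Would be called once from the fast reboot entry path. */
void fast_reboot_mark(void)
{
	fast_reboot_in_progress = true;
}

/* Init code asks this rather than probing for already-populated state. */
bool is_fast_reboot(void)
{
	return fast_reboot_in_progress;
}

/* Counts how often the example init site did its one-time setup. */
int led_nodes_created;

/* Example init site: does its setup on IPL and skips it on fast reboot. */
void create_led_nodes_example(void)
{
	if (is_fast_reboot())
		return;	/* nodes survive from the original IPL */
	led_nodes_created++;
}
```

The point is only that the early return then says "fast reboot"
explicitly at every call site.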

>  	dt_add_property_strings(pled, "compatible", DT_PROPERTY_LED_COMPATIBLE);
>  
>  	led_mode = dt_prop_get(pled, DT_PROPERTY_LED_MODE);
> diff --git a/hw/occ.c b/hw/occ.c
> index b606a67..3d86f7a 100644
> --- a/hw/occ.c
> +++ b/hw/occ.c
> @@ -517,10 +517,14 @@ void occ_pstates_init(void)
>  	struct proc_chip *chip;
>  	struct cpu_thread *c;
>  	s8 pstate_nom;
> +	static bool occ_pstates_initialized;
>  
>  	/* OCC is P8 only */
>  	if (proc_gen != proc_gen_p8)
>  		return;
> +	/* Handle fast reboots */
> +	if (occ_pstates_initialized)
> +		return;
>  
>  	chip = next_chip(NULL);
>  	if (!chip->homer_base) {
> @@ -558,6 +562,7 @@ void occ_pstates_init(void)
>  	for_each_chip(chip)
>  		chip->throttle = 0;
>  	opal_add_poller(occ_throttle_poll, NULL);
> +	occ_pstates_initialized = true;
>  }

We could reset the OCCs on fast reboot, as there is a PRD command to
disable them... although nobody should *ever* use them.

FWIW I've been looking at moving OCC init even on openpower into skiboot
for the purposes of speeding up boot, so this could be "easy".

> diff --git a/hw/psi.c b/hw/psi.c
> index 3efc177..bb55c10 100644
> --- a/hw/psi.c
> +++ b/hw/psi.c
> @@ -432,34 +432,25 @@ static int64_t psi_p7_get_xive(struct irq_source *is, uint32_t isn __unused,
>  	return OPAL_SUCCESS;
>  }
>  
> +static const uint32_t psi_p8_irq_to_xivr[P8_IRQ_PSI_ALL_COUNT] = {
> +	[P8_IRQ_PSI_FSP]	= PSIHB_XIVR_FSP,
> +	[P8_IRQ_PSI_OCC]	= PSIHB_XIVR_OCC,
> +	[P8_IRQ_PSI_FSI]	= PSIHB_XIVR_FSI,
> +	[P8_IRQ_PSI_LPC]	= PSIHB_XIVR_LPC,
> +	[P8_IRQ_PSI_LOCAL_ERR]	= PSIHB_XIVR_LOCAL_ERR,
> +	[P8_IRQ_PSI_HOST_ERR]	= PSIHB_XIVR_HOST_ERR,
> +};
> +

This hunk had a conflict with master IIRC.

> --- a/include/config.h
> +++ b/include/config.h
> @@ -73,7 +73,7 @@
>  //#define FORCE_DUMMY_CONSOLE 1
>  
>  /* Enable this to do fast resets. Currently unreliable... */
> -//#define ENABLE_FAST_RESET	1
> +#define ENABLE_FAST_REBOOT	1

I think we could remove the config var and just do it via NVRAM setting.

>  /* Enable this to make fast reboot clear memory */
>  //#define FAST_REBOOT_CLEARS_MEMORY	1

We might want to do this, although that could come in the future.
Getting all threads to each clear a part of memory could be an easy way
to do it quickly.
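
As a rough sketch of that idea: each thread takes an equal slice of the
range and zeroes it, with the last thread picking up the remainder.
clear_slice() and its parameters are hypothetical. A real version would
have to skip skiboot's own memory and any reserved regions, which this
ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Zero this thread's share of [base, base + size).
 * thread_idx is 0-based; nr_threads is the number of participating threads.
 */
void clear_slice(unsigned char *base, size_t size,
		 unsigned int thread_idx, unsigned int nr_threads)
{
	size_t chunk, start, len;

	assert(nr_threads > 0 && thread_idx < nr_threads);

	chunk = size / nr_threads;
	start = (size_t)thread_idx * chunk;
	/* The last thread also clears whatever the division left over. */
	len = (thread_idx == nr_threads - 1) ? size - start : chunk;

	memset(base + start, 0, len);
}
```

Each thread would call this with its own index, so no locking is needed
because the slices do not overlap.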


-- 
Stewart Smith
OPAL Architect, IBM.
