[PATCH] PSeries: Cancel RTAS event scan before firmware flash

Suzuki Poulose suzuki at in.ibm.com
Mon Aug 8 16:46:42 EST 2011


On 07/27/11 17:39, Ravi K. Nittala wrote:
> The firmware flash update is conducted using an RTAS call, that is serialized
> by lock_rtas() which uses spin_lock. rtasd keeps scanning for the RTAS events
> generated on the machine. This is performed via a delayed workqueue, invoking
> an RTAS call to scan the events.
>
> The flash update takes a while to complete and during this time, any other
> RTAS call has to wait. In this case, rtas_event_scan() waits for a long time
> on the spin_lock resulting in a soft lockup.
>
> Approaches to fix the issue :
>
> Approach 1: Stop all the other CPUs before we start flashing the firmware.
>
> Before the rtas firmware update starts, all other CPUs should be stopped.
> Which means no other CPU should be in lock_rtas(). We do not want other CPUs
> execute while FW update is in progress and the system will be rebooted anyway
> after the update.
>
> --- arch/powerpc/kernel/setup-common.c.orig    2011-07-01 22:41:12.952507971 -0400
> +++ arch/powerpc/kernel/setup-common.c    2011-07-01 22:48:31.182507915 -0400
> @@ -109,11 +109,12 @@ void machine_shutdown(void)
>    void machine_restart(char *cmd)
>    {
>        machine_shutdown();
> -    if (ppc_md.restart)
> -        ppc_md.restart(cmd);
>    #ifdef CONFIG_SMP
> -    smp_send_stop();
> +        smp_send_stop();
>    #endif
> +    if (ppc_md.restart)
> +        ppc_md.restart(cmd);
> +
>        printk(KERN_EMERG "System Halted, OK to turn off power\n");
>        local_irq_disable();
>        while (1) ;
>
> Problems with this approach:
> Stopping the CPUs suddenly may cause other serious problems depending on what
> was running on them. Hence, this approach cannot be considered.
>
>
> Approach 2: Cancel the rtas_scan_event work before starting the firmware flash.
>
> Just before the flash update is performed, the queued rtas_event_scan() work
> item is cancelled from the work queue so that there is no other RTAS call
> issued while the flash is in progress. After the flash completes, the system
> reboots and the rtas_event_scan() is rescheduled.
>
> Approach 2 looks to be a better solution than Approach 1. Kindly let us know
> your thoughts. Patch attached.
>

Ben,

Could you please let us know your thoughts about the following patch ?

Thanks
Suzuki


>
> Signed-off-by: Suzuki Poulose<suzuki at in.ibm.com>
> Signed-off-by: Ravi Nittala<ravi.nittala at in.ibm.com>
>
>
> ---
>   arch/powerpc/include/asm/rtas.h  |    2 ++
>   arch/powerpc/kernel/rtas_flash.c |    6 ++++++
>   arch/powerpc/kernel/rtasd.c      |    6 ++++++
>   3 files changed, 14 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index 58625d1..3f26f87 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -245,6 +245,8 @@ extern int early_init_dt_scan_rtas(unsigned long node,
>
>   extern void pSeries_log_error(char *buf, unsigned int err_type, int fatal);
>
> +extern bool rtas_cancel_event_scan(void);
> +
>   /* Error types logged.  */
>   #define ERR_FLAG_ALREADY_LOGGED	0x0
>   #define ERR_FLAG_BOOT		0x1 	/* log was pulled from NVRAM on boot */
> diff --git a/arch/powerpc/kernel/rtas_flash.c b/arch/powerpc/kernel/rtas_flash.c
> index e037c74..4174b4b 100644
> --- a/arch/powerpc/kernel/rtas_flash.c
> +++ b/arch/powerpc/kernel/rtas_flash.c
> @@ -568,6 +568,12 @@ static void rtas_flash_firmware(int reboot_type)
>   	}
>
>   	/*
> +	 * Just before starting the firmware flash, cancel the event scan work
> +	 * to avoid any soft lockup issues.
> +	 */
> +	rtas_cancel_event_scan();
> +
> +	/*
>   	 * NOTE: the "first" block must be under 4GB, so we create
>   	 * an entry with no data blocks in the reserved buffer in
>   	 * the kernel data segment.
> diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
> index 481ef06..e8f03fa 100644
> --- a/arch/powerpc/kernel/rtasd.c
> +++ b/arch/powerpc/kernel/rtasd.c
> @@ -472,6 +472,12 @@ static void start_event_scan(void)
>   				&event_scan_work, event_scan_delay);
>   }
>
> +/* Cancel the rtas event scan work */
> +bool rtas_cancel_event_scan(void)
> +{
> +	return cancel_delayed_work_sync(&event_scan_work);
> +}
> +
>   static int __init rtas_init(void)
>   {
>   	struct proc_dir_entry *entry;
>



More information about the Linuxppc-dev mailing list