[PATCH v2] powerpc/kexec: Fix orphaned offline CPUs across kexec
Michael Neuling
mikey at neuling.org
Fri Jul 30 10:08:32 EST 2010
In message <4C511216.30109 at ozlabs.org> you wrote:
> When CPU hotplug is used, some CPUs may be offline at the time a kexec is
> performed. The subsequent kernel may expect these CPUs to be already running,
> and will declare them stuck. On pseries, there's also a soft-offline (cede)
> state that CPUs may be in; this can also cause problems as the kexeced kernel
> may ask RTAS if they're online -- and RTAS would say they are. Again, stuck.
>
> This patch kicks each present offline CPU awake before the kexec, so that
> none are lost to these assumptions in the subsequent kernel.
There are a lot of cleanups in this patch. The change you are making
would be a lot clearer without all the additional cleanups in there. I
think I'd like to see this as two patches. One for cleanups and one for
the addition of wake_offline_cpus().
Other than that, I'm not completely convinced this is the functionality
we want. Do we really want to online these cpus? Why were they
offlined in the first place? I understand the stuck problem, but is the
solution to online them, or to change the device tree so that the second
kernel doesn't detect them as stuck?
Mikey
>
> Signed-off-by: Matt Evans <matt at ozlabs.org>
> ---
> v2: Added FIXME comment noting a possible problem with incorrectly
> started secondary CPUs, following feedback from Milton.
>
> arch/powerpc/kernel/machine_kexec_64.c | 55 ++++++++++++++++++++++++++++---
> 1 files changed, 49 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/kernel/machine_kexec_64.c b/arch/powerpc/kernel/machine_kexec_64.c
> index 4fbb3be..37f805e 100644
> --- a/arch/powerpc/kernel/machine_kexec_64.c
> +++ b/arch/powerpc/kernel/machine_kexec_64.c
> @@ -15,6 +15,8 @@
> #include <linux/thread_info.h>
> #include <linux/init_task.h>
> #include <linux/errno.h>
> +#include <linux/kernel.h>
> +#include <linux/cpu.h>
>
> #include <asm/page.h>
> #include <asm/current.h>
> @@ -181,7 +183,20 @@ static void kexec_prepare_cpus_wait(int wait_state)
> int my_cpu, i, notified=-1;
>
> my_cpu = get_cpu();
> - /* Make sure each CPU has atleast made it to the state we need */
> + /* Make sure each CPU has at least made it to the state we need.
> + *
> + * FIXME: There is a (slim) chance of a problem if not all of the CPUs
> + * are correctly onlined. If somehow we start a CPU on boot with RTAS
> + * start-cpu, but somehow that CPU doesn't write callin_cpu_map[] in
> + * time, the boot CPU will timeout. If it does eventually execute
> + * stuff, the secondary will start up (paca[].cpu_start was written) and
> + * get into a peculiar state. If the platform supports
> + * smp_ops->take_timebase(), the secondary CPU will probably be spinning
> + * in there. If not (i.e. pseries), the secondary will continue on and
> + * try to online itself/idle/etc. If it survives that, we need to find
> + * these possible-but-not-online-but-should-be CPUs and chaperone them
> + * into kexec_smp_wait().
> + */
> for_each_online_cpu(i) {
> if (i == my_cpu)
> continue;
> @@ -189,9 +204,9 @@ static void kexec_prepare_cpus_wait(int wait_state)
> while (paca[i].kexec_state < wait_state) {
> barrier();
> if (i != notified) {
> - printk( "kexec: waiting for cpu %d (physical"
> - " %d) to enter %i state\n",
> - i, paca[i].hw_cpu_id, wait_state);
> + printk(KERN_INFO "kexec: waiting for cpu %d "
> + "(physical %d) to enter %i state\n",
> + i, paca[i].hw_cpu_id, wait_state);
> notified = i;
> }
> }
> @@ -199,9 +214,32 @@ static void kexec_prepare_cpus_wait(int wait_state)
> mb();
> }
>
> -static void kexec_prepare_cpus(void)
> +/*
> + * We need to make sure each present CPU is online. The next kernel will scan
> + * the device tree and assume primary threads are online and query secondary
> + * threads via RTAS to online them if required. If we don't online primary
> + * threads, they will be stuck. However, we also online secondary threads as we
> + * may be using 'cede offline'. In this case RTAS doesn't see the secondary
> + * threads as offline -- and again, these CPUs will be stuck.
> + *
> + * So, we online all CPUs that should be running, including secondary threads.
> + */
> +static void wake_offline_cpus(void)
> {
> + int cpu = 0;
>
> + for_each_present_cpu(cpu) {
> + if (!cpu_online(cpu)) {
> + printk(KERN_INFO "kexec: Waking offline cpu %d.\n",
> + cpu);
> + cpu_up(cpu);
> + }
> + }
> +}
> +
> +static void kexec_prepare_cpus(void)
> +{
> + wake_offline_cpus();
> smp_call_function(kexec_smp_down, NULL, /* wait */0);
> local_irq_disable();
> mb(); /* make sure IRQs are disabled before we say they are */
> @@ -215,7 +253,10 @@ static void kexec_prepare_cpus(void)
> if (ppc_md.kexec_cpu_down)
> ppc_md.kexec_cpu_down(0, 0);
>
> - /* Before removing MMU mapings make sure all CPUs have entered real mode */
> + /*
> + * Before removing MMU mappings make sure all CPUs have entered real
> + * mode:
> + */
> kexec_prepare_cpus_wait(KEXEC_STATE_REAL_MODE);
>
> put_cpu();
> @@ -284,6 +325,8 @@ void default_machine_kexec(struct kimage *image)
> if (crashing_cpu == -1)
> kexec_prepare_cpus();
>
> + pr_debug("kexec: Starting switchover sequence.\n");
> +
> /* switch to a staticly allocated stack. Based on irq stack code.
> * XXX: the task struct will likely be invalid once we do the copy!
> */
> --
> 1.6.3.3
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev at lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>