[PATCH] powerpc/xive: Do not skip CPU-less nodes when creating the IPIs
Cédric Le Goater
clg at kaod.org
Fri Aug 6 21:50:43 AEST 2021
On 6/29/21 3:15 PM, Cédric Le Goater wrote:
> On PowerVM, CPU-less nodes can be populated with hot-plugged CPUs at
> runtime. Today, the IPI is not created for such nodes, and hot-plugged
> CPUs use a bogus IPI, which leads to soft lockups.
>
> We could create the node IPI on demand but it is a bit complex because
> this code would be called under bringup_cpu() and some IRQ locking is
> being done. The simplest solution is to create the IPIs for all nodes
> at startup.
>
> Fixes: 7dcc37b3eff9 ("powerpc/xive: Map one IPI interrupt per node")
> Cc: stable at vger.kernel.org # v5.13
> Reported-by: Geetika Moolchandani <Geetika.Moolchandani1 at ibm.com>
> Cc: Srikar Dronamraju <srikar at linux.vnet.ibm.com>
> Signed-off-by: Cédric Le Goater <clg at kaod.org>
> ---
>
> This patch breaks old versions of irqbalance (<= v1.4). Possible nodes
> are collected from /sys/devices/system/node/ but CPU-less nodes are
> not listed there. When interrupts are scanned, the link representing
> the node structure is NULL and a segfault occurs.
This is an irqbalance regression due to:
https://github.com/Irqbalance/irqbalance/pull/172
I will report through an issue.
Anyhow, there is a better approach which is to allocate IPIs for all
nodes at boot time and do the mapping on demand. Removing the mapping
on last use seems more complex though.
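
As a rough illustration of that idea (a userspace sketch only, not the v2
patch): one HW interrupt number is reserved per possible node at boot, and
the mapping is created the first time a CPU of that node needs it. All
names here (node_ipi, alloc_all_node_ipis, get_node_ipi, MAX_NODES) are
hypothetical, the interrupt numbers are arbitrary, and the IRQ locking
mentioned above is not modelled.

```c
#include <assert.h>

#define MAX_NODES 4   /* hypothetical number of possible nodes */
#define NO_VIRQ   0u  /* 0 means "not mapped yet" in this sketch */

/* Hypothetical per-node descriptor; not the kernel's struct xive_ipi_desc. */
struct node_ipi {
	unsigned int hw_irq; /* reserved once, at boot */
	unsigned int virq;   /* filled in on first use */
};

static struct node_ipi node_ipis[MAX_NODES];
static unsigned int next_virq = 16; /* arbitrary starting value */

/* Boot time: reserve one HW interrupt per possible node, CPU-less or not. */
static void alloc_all_node_ipis(void)
{
	for (int node = 0; node < MAX_NODES; node++) {
		node_ipis[node].hw_irq = 0x100u + (unsigned int)node;
		node_ipis[node].virq = NO_VIRQ;
	}
}

/* CPU bring-up: map the node IPI only when a CPU of that node first needs it. */
static unsigned int get_node_ipi(int node)
{
	assert(node >= 0 && node < MAX_NODES);
	struct node_ipi *ipi = &node_ipis[node];

	if (ipi->virq == NO_VIRQ)
		ipi->virq = next_virq++;
	return ipi->virq;
}
```

In this sketch a node that has no CPUs at boot still owns a reserved
hw_irq, so a CPU hot-plugged into it later gets a valid mapping from
get_node_ipi() instead of a bogus one. Undoing the mapping when the last
CPU of a node goes away is deliberately left out.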
I will send a v2 after some tests.
Thanks,
C.
> Version 1.7 seems immune.
>
> ---
> arch/powerpc/sysdev/xive/common.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
> index f3b16ed48b05..5d2c58dba57e 100644
> --- a/arch/powerpc/sysdev/xive/common.c
> +++ b/arch/powerpc/sysdev/xive/common.c
> @@ -1143,10 +1143,6 @@ static int __init xive_request_ipi(void)
>  		struct xive_ipi_desc *xid = &xive_ipis[node];
>  		struct xive_ipi_alloc_info info = { node };
>  
> -		/* Skip nodes without CPUs */
> -		if (cpumask_empty(cpumask_of_node(node)))
> -			continue;
> -
>  		/*
>  		 * Map one IPI interrupt per node for all cpus of that node.
>  		 * Since the HW interrupt number doesn't have any meaning,
>