[PATCH v2] powerpc/perf: Fix memory allocation for core-imc based on num_possible_cpus()
ppaidipe
ppaidipe at linux.vnet.ibm.com
Wed May 16 16:48:18 AEST 2018
On 2018-05-16 12:05, Anju T Sudhakar wrote:
> Currently memory is allocated for core-imc based on cpu_present_mask,
> which has bit 'cpu' set iff cpu is populated. We use (cpu number /
> threads per core) as the array index to access the memory.
>
> Under some circumstances firmware marks a CPU as GUARDed and boots the
> system anyway. Until cleared of errors, these CPUs are unavailable for
> all subsequent boots. GUARDed CPUs are possible but not present from
> Linux's view, which leaves a hole in the CPU numbering. We assume the
> length of our allocation is driven by the number of present CPUs,
> whereas an online CPU can have a number beyond that count because of
> the hole. The (cpu number / threads per core) value then exceeds the
> array bounds and leads to a memory overflow.
>
> Call trace observed during a guard test:
>
> Faulting instruction address: 0xc000000000149f1c
> cpu 0x69: Vector: 380 (Data Access Out of Range) at [c000003fea303420]
> pc:c000000000149f1c: prefetch_freepointer+0x14/0x30
> lr:c00000000014e0f8: __kmalloc+0x1a8/0x1ac
> sp:c000003fea3036a0
> msr:9000000000009033
> dar:c9c54b2c91dbf6b7
> current = 0xc000003fea2c0000
> paca = 0xc00000000fddd880 softe: 3 irq_happened: 0x01
> pid = 1, comm = swapper/104
> Linux version 4.16.7-openpower1 (smc at smc-desktop) (gcc version 6.4.0 (Buildroot 2018.02.1-00006-ga8d1126)) #2 SMP Fri May 4 16:44:54 PDT 2018
> enter ? for help
> call trace:
> __kmalloc+0x1a8/0x1ac (unreliable)
> init_imc_pmu+0x7f4/0xbf0
> opal_imc_counters_probe+0x3fc/0x43c
> platform_drv_probe+0x48/0x80
> driver_probe_device+0x22c/0x308
> __driver_attach+0xa0/0xd8
> bus_for_each_dev+0x88/0xb4
> driver_attach+0x2c/0x40
> bus_add_driver+0x1e8/0x228
> driver_register+0xd0/0x114
> __platform_driver_register+0x50/0x64
> opal_imc_driver_init+0x24/0x38
> do_one_initcall+0x150/0x15c
> kernel_init_freeable+0x250/0x254
> kernel_init+0x1c/0x150
> ret_from_kernel_thread+0x5c/0xc8
>
> Allocating memory for core-imc based on cpu_possible_mask, which has
> bit 'cpu' set iff cpu is populatable, will fix this issue.
>
> Reported-by: Pridhiviraj Paidipeddi <ppaidipe at linux.vnet.ibm.com>
> Signed-off-by: Anju T Sudhakar <anju at linux.vnet.ibm.com>
> Reviewed-by: Balbir Singh <bsingharora at gmail.com>
I have verified this fix with both normal and kexec boots, multiple times,
on a system with GARDed cores. I have not seen any crashes or memory
corruption issues with it.
Tested-by: Pridhiviraj Paidipeddi <ppaidipe at linux.vnet.ibm.com>
> ---
> arch/powerpc/perf/imc-pmu.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
> index d7532e7..75fb23c 100644
> --- a/arch/powerpc/perf/imc-pmu.c
> +++ b/arch/powerpc/perf/imc-pmu.c
> @@ -1146,7 +1146,7 @@ static int init_nest_pmu_ref(void)
>
> static void cleanup_all_core_imc_memory(void)
> {
> - int i, nr_cores = DIV_ROUND_UP(num_present_cpus(), threads_per_core);
> + int i, nr_cores = DIV_ROUND_UP(num_possible_cpus(), threads_per_core);
> struct imc_mem_info *ptr = core_imc_pmu->mem_info;
> int size = core_imc_pmu->counter_mem_size;
>
> @@ -1264,7 +1264,7 @@ static int imc_mem_init(struct imc_pmu *pmu_ptr, struct device_node *parent,
> if (!pmu_ptr->pmu.name)
> return -ENOMEM;
>
> - nr_cores = DIV_ROUND_UP(num_present_cpus(), threads_per_core);
> + nr_cores = DIV_ROUND_UP(num_possible_cpus(), threads_per_core);
> pmu_ptr->mem_info = kcalloc(nr_cores, sizeof(struct imc_mem_info),
> GFP_KERNEL);
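
For readers following the indexing argument in the commit message, here is a
minimal userspace sketch of why sizing the array by the number of present CPUs
can be too small. It is not kernel code. The topology (16 possible CPUs, 4
threads per core, core 1 GARDed) and all helper names are hypothetical and
chosen only for illustration; the sketch also assumes possible CPU numbers are
contiguous from 0.

```c
#include <assert.h>
#include <stdbool.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical topology, for illustration only. */
enum { NR_POSSIBLE_CPUS = 16, THREADS_PER_CORE = 4, GARDED_CORE = 1 };

/* A CPU counts as "present" here unless it belongs to the GARDed core. */
static bool cpu_is_present(int cpu)
{
	assert(cpu >= 0 && cpu < NR_POSSIBLE_CPUS);
	return cpu / THREADS_PER_CORE != GARDED_CORE;
}

/* Stand-in for num_present_cpus() in this toy model. */
static int count_present_cpus(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_POSSIBLE_CPUS; cpu++)
		if (cpu_is_present(cpu))
			n++;
	return n;
}

/* Largest (cpu / threads_per_core) index used by any present CPU. */
static int max_core_index(void)
{
	int cpu, max = -1;

	for (cpu = 0; cpu < NR_POSSIBLE_CPUS; cpu++)
		if (cpu_is_present(cpu) && cpu / THREADS_PER_CORE > max)
			max = cpu / THREADS_PER_CORE;
	return max;
}

/* Array length as computed before the patch (present CPUs). */
static int nr_cores_by_present(void)
{
	return DIV_ROUND_UP(count_present_cpus(), THREADS_PER_CORE);
}

/* Array length as computed after the patch (possible CPUs). */
static int nr_cores_by_possible(void)
{
	return DIV_ROUND_UP(NR_POSSIBLE_CPUS, THREADS_PER_CORE);
}
```

In this toy model 12 CPUs are present, so the pre-patch length is 3 entries,
while CPU 15 still maps to index 3, one past the end of the array. Sizing by
the possible CPUs gives 4 entries, which covers every index that can occur.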