[PATCH] tools/lib/perf: Fix -Werror=alloc-size-larger-than in cpumap.c

Likhitha Korrapati likhitha at linux.ibm.com
Fri May 2 19:05:54 AEST 2025


Hi Arnaldo,

On 4/25/25 23:16, Arnaldo Carvalho de Melo wrote:
> On Fri, Apr 25, 2025 at 08:19:02PM +0530, Athira Rajeev wrote:
>>> On 14 Apr 2025, at 7:08 AM, Madhavan Srinivasan <maddy at linux.ibm.com> wrote:
>>> On 4/7/25 5:38 PM, Venkat Rao Bagalkote wrote:
>>>> On 07/04/25 12:10 am, Athira Rajeev wrote:
>>>>>> On 6 Apr 2025, at 10:04 PM, Likhitha Korrapati <likhitha at linux.ibm.com> wrote:
> 
>>>>>> perf build break observed when using gcc 13-3 (FC39 ppc64le)
>>>>>> with the following error.
> 
>>>>>> cpumap.c: In function 'perf_cpu_map__merge':
>>>>>> cpumap.c:414:20: error: argument 1 range [18446744069414584320, 18446744073709551614] exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=]
>>>>>>    414 |         tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
>>>>>>        |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>>> In file included from cpumap.c:4:
>>>>>> /usr/include/stdlib.h:672:14: note: in a call to allocation function 'malloc' declared here
>>>>>>    672 | extern void *malloc (size_t __size) __THROW __attribute_malloc__
>>>>>>        |              ^~~~~~
>>>>>> cc1: all warnings being treated as errors
> 
>>>>>> Error happens to be only in gcc13-3 and not in latest gcc 14.
>>>>>> Even though git-bisect pointed bad commit as:
>>>>>> 'commit f5b07010c13c ("libperf: Don't remove -g when EXTRA_CFLAGS are used")',
>>>>>> issue is with tmp_len being "int". It holds number of cpus and making
>>>>>> it "unsigned int" fixes the issues.
> 
>>>>>> After the fix:
> 
>>>>>>    CC      util/pmu-flex.o
>>>>>>    CC      util/expr-flex.o
>>>>>>    LD      util/perf-util-in.o
>>>>>>    LD      perf-util-in.o
>>>>>>    AR      libperf-util.a
>>>>>>    LINK    perf
>>>>>>    GEN     python/perf.cpython-312-powerpc64le-linux-gnu.so
> 
>>>>>> Signed-off-by: Likhitha Korrapati <likhitha at linux.ibm.com>
>>>>> Looks good to me
> 
>>>>> Reviewed-by: Athira Rajeev <atrajeev at linux.ibm.com>
> 
>>>> Tested this patch on perf-tools-next repo, and this patch fixes the issue.
> 
>>>> Tested-by: Venkat Rao Bagalkote <venkat88 at linux.ibm.com>
> 
>>> Arnaldo, Namhyung,
> 
>>> can you consider pulling this fix? since it is breaking the build in gcc13-3 or
>>> if you have any comments do let us know.
> 
> This isn't the only place in that file where this pattern exists:
> 
> ⬢ [acme at toolbx perf-tools-next]$ grep malloc tools/lib/perf/cpumap.c
> 	cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
> 	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> 	tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> ⬢ [acme at toolbx perf-tools-next]$
> 
> 
> struct perf_cpu_map *perf_cpu_map__alloc(int nr_cpus)
> {
>          RC_STRUCT(perf_cpu_map) *cpus;
>          struct perf_cpu_map *result;
> 
>          if (nr_cpus == 0)
>                  return NULL;
> 
>          cpus = malloc(sizeof(*cpus) + sizeof(struct perf_cpu) * nr_cpus);
> 
> 
> int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other)
> {
>          struct perf_cpu *tmp_cpus;
>          int tmp_len;
>          int i, j, k;
>          struct perf_cpu_map *merged;
> 
>          if (perf_cpu_map__is_subset(*orig, other))
>                  return 0;
>          if (perf_cpu_map__is_subset(other, *orig)) {
>                  perf_cpu_map__put(*orig);
>                  *orig = perf_cpu_map__get(other);
>                  return 0;
>          }
> 
>          tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other);
>          tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> 
> 
> struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig,
>                                               struct perf_cpu_map *other)
> {
>          struct perf_cpu *tmp_cpus;
>          int tmp_len;
>          int i, j, k;
>          struct perf_cpu_map *merged = NULL;
> 
>          if (perf_cpu_map__is_subset(other, orig))
>                  return perf_cpu_map__get(orig);
>          if (perf_cpu_map__is_subset(orig, other))
>                  return perf_cpu_map__get(other);
> 
>          tmp_len = max(__perf_cpu_map__nr(orig), __perf_cpu_map__nr(other));
>          tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> 
> I'm trying to figure out why its only in perf_cpu_map__merge() that this
> triggers :-\
> 
> Maybe that max() call in perf_cpu_map__intersect() somehow makes the
> compiler happy.
> 
> And in perf_cpu_map__alloc() all calls seems to validate it.
> 
> But wouldn't turning this into a calloc() be better?
> 
> Like:
> 
> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> index 4454a5987570cfbc..99d21618a252ac0e 100644
> --- a/tools/lib/perf/cpumap.c
> +++ b/tools/lib/perf/cpumap.c
> @@ -411,7 +411,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other)
>          }
>   
>          tmp_len = __perf_cpu_map__nr(*orig) + __perf_cpu_map__nr(other);
> -       tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
> +       tmp_cpus = calloc(tmp_len, sizeof(struct perf_cpu));
>          if (!tmp_cpus)
>                  return -ENOMEM;
>   
> ⬢ [acme at toolbx perf-tools-next]$
> 
> 
> And better, do the max size that the compiler is trying to help us
> catch?
> 
> - Arnaldo

I am not sure that using max() is right in perf_cpu_map__merge(), since
this is a merge operation. Because the two lengths are summed, the result
can be greater than the max of the orig and other cpu counts. max() might
be correct in perf_cpu_map__intersect(), but it will cause issues in
perf_cpu_map__merge(). Can you please elaborate on whether you meant this,
or whether there is something else I missed?
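
To illustrate the sizing concern, here is a minimal standalone sketch. It
is not the libperf code: it uses plain sorted int arrays, and the helper
names sorted_union() and sorted_intersect() are hypothetical. It only shows
that a union can need up to na + nb slots, while an intersection never
needs more than the shorter input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: union of two sorted, duplicate-free arrays.
 * 'out' must have room for na + nb entries. Returns entries written. */
size_t sorted_union(const int *a, size_t na, const int *b, size_t nb, int *out)
{
	size_t i = 0, j = 0, k = 0;

	while (i < na && j < nb) {
		if (a[i] < b[j]) {
			out[k++] = a[i++];
		} else if (a[i] > b[j]) {
			out[k++] = b[j++];
		} else {
			/* Present in both inputs: emit once, advance both. */
			out[k++] = a[i++];
			j++;
		}
	}
	while (i < na)
		out[k++] = a[i++];
	while (j < nb)
		out[k++] = b[j++];

	assert(k <= na + nb);
	return k;
}

/* Hypothetical helper: intersection of two sorted, duplicate-free arrays.
 * 'out' needs room for min(na, nb) entries, so max(na, nb) is also safe. */
size_t sorted_intersect(const int *a, size_t na, const int *b, size_t nb, int *out)
{
	size_t i = 0, j = 0, k = 0;

	while (i < na && j < nb) {
		if (a[i] < b[j]) {
			i++;
		} else if (a[i] > b[j]) {
			j++;
		} else {
			out[k++] = a[i++];
			j++;
		}
	}
	return k;
}
```

For example, with a = {0, 1, 2, 3} and b = {2, 3, 4, 5, 6}, the union has 7
entries, which is more than max(4, 5), so a buffer sized with max() would
be too small for the merge case.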

I tried the following:

int perf_cpu_map__nr(const struct perf_cpu_map *cpus)
{
         return cpus ? __perf_cpu_map__nr(cpus) : 1;
}

static int __perf_cpu_map__nr(const struct perf_cpu_map *cpus)
{
         return RC_CHK_ACCESS(cpus)->nr;
}

This got introduced via commit 7d1b529f164d33ad4514b272bcec65036873d717,
which assumes the cpu map is non-null.

And we check for non-null in perf_cpu_map__merge():

         if (perf_cpu_map__is_subset(*orig, other))
                 return 0;
         if (perf_cpu_map__is_subset(other, *orig)) {
                 perf_cpu_map__put(*orig);
                 *orig = perf_cpu_map__get(other);
                 return 0;
         }

Using perf_cpu_map__nr() instead of __perf_cpu_map__nr() works, as it has
a NULL check.

--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -410,7 +410,7 @@ int perf_cpu_map__merge(struct perf_cpu_map **orig, struct perf_cpu_map *other)
                 return 0;
         }

-       tmp_len = max(__perf_cpu_map__nr(*orig), __perf_cpu_map__nr(other));
+       tmp_len = perf_cpu_map__nr(*orig) +  perf_cpu_map__nr(other);
         tmp_cpus = malloc(tmp_len * sizeof(struct perf_cpu));
         if (!tmp_cpus)
                 return -ENOMEM;
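
For reference, the range in the gcc error can be reproduced with a small
standalone sketch. This is only an illustration, not a proposed patch:
struct elem is a hypothetical stand-in for struct perf_cpu, and its 2-byte
size is an assumption inferred from the reported range. The range looks
consistent with a negative "int" length being converted to size_t before
the multiplication. alloc_elems() is likewise a hypothetical helper showing
one way to reject such lengths and let calloc() do the checked
multiplication.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct perf_cpu; the 2-byte size is an
 * assumption inferred from the range printed in the gcc error. */
struct elem {
	int16_t cpu;
};

static_assert(sizeof(struct elem) == 2, "sketch assumes a 2-byte element");

/* Byte count as malloc() would see it when the length is a signed int:
 * n is converted to size_t before the multiplication. */
size_t bytes_for(int n)
{
	return (size_t)n * sizeof(struct elem);
}

/* Hypothetical checked allocator: reject non-positive lengths up front,
 * then let calloc() perform the overflow-checked multiplication. */
struct elem *alloc_elems(int n)
{
	if (n <= 0)
		return NULL;
	return calloc((size_t)n, sizeof(struct elem));
}
```

With a 64-bit size_t, bytes_for(-1) is 18446744073709551614 and
bytes_for(INT_MIN) is 18446744069414584320, which are the two ends of the
range in the error message.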

Regards,
Likhitha.

