[PATCH] powerpc/perf: Fix for core/nest imc call trace on cpuhotplug

Anju T Sudhakar anju at linux.vnet.ibm.com
Mon Sep 25 21:52:01 AEST 2017


Hi mpe,


On Thursday 21 September 2017 10:04 AM, Michael Ellerman wrote:
> Anju T Sudhakar <anju at linux.vnet.ibm.com> writes:
>
>> Nest/core pmu units are enabled only when it is used. A reference count is
>> maintained for the events which uses the nest/core pmu units. Currently in
>> *_imc_counters_release function a WARN() is used for notification of any
>> underflow of ref count. Replace WARN() with a pr_info since it is an overkill.
> As discussed elsewhere this is not the right solution.
>
> If it's OK for the reference count to be negative, then we shouldn't
> print anything when it is.
>
> But I don't understand how it can be OK for the refcount to be negative.
> That means someone has a negative number of references to something?
>
> cheers
>

Scenario where this happens is in a stress test where perf session is 
started, followed by offlining of all cpus in a given core. And finally 
terminate the perf session.

So, in cpuhotplug offline path(ppc_core_imc_cpu_offline), function set 
the ref->count to zero, if the current cpu which is about to offline is 
the last cpu in a given core and make an OPAL call to disable the engine 
in that core.
And on perf session termination, perf->destory 
(core_imc_counters_release) will first decrement the ref->count for this 
core and based on the ref->count value an opal call is made to disable 
the core-imc engine.

Now, since cpuhotplug path already clears the ref->count for core and 
disabled the engine, perf->destroy() decrementing again at event 
termination make it negative which in turn fires the WARN_ON.

So we do prefer to remove the message as this wont happen in normal 
operation and the core counters are working as expected.
I will send out a patch by removing the message asap.


Thanks,
Anju




More information about the Linuxppc-dev mailing list