[PATCH] powerpc/perf: Fix power_pmu_disable to call clear_pmi_irq_pending only if PMI is pending

Nicholas Piggin npiggin at gmail.com
Wed Jan 19 15:24:30 AEDT 2022


Excerpts from Athira Rajeev's message of January 15, 2022 5:20 pm:
> Running selftest with CONFIG_PPC_IRQ_SOFT_MASK_DEBUG enabled in kernel
> triggered below warning:
> 
> [  172.851380] ------------[ cut here ]------------
> [  172.851391] WARNING: CPU: 8 PID: 2901 at arch/powerpc/include/asm/hw_irq.h:246 power_pmu_disable+0x270/0x280
> [  172.851402] Modules linked in: dm_mod bonding nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink sunrpc xfs libcrc32c pseries_rng xts vmx_crypto uio_pdrv_genirq uio sch_fq_codel ip_tables ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse
> [  172.851442] CPU: 8 PID: 2901 Comm: lost_exception_ Not tainted 5.16.0-rc5-03218-g798527287598 #2
> [  172.851451] NIP:  c00000000013d600 LR: c00000000013d5a4 CTR: c00000000013b180
> [  172.851458] REGS: c000000017687860 TRAP: 0700   Not tainted  (5.16.0-rc5-03218-g798527287598)
> [  172.851465] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 48004884  XER: 20040000
> [  172.851482] CFAR: c00000000013d5b4 IRQMASK: 1
> [  172.851482] GPR00: c00000000013d5a4 c000000017687b00 c000000002a10600 0000000000000004
> [  172.851482] GPR04: 0000000082004000 c0000008ba08f0a8 0000000000000000 00000008b7ed0000
> [  172.851482] GPR08: 00000000446194f6 0000000000008000 c00000000013b118 c000000000d58e68
> [  172.851482] GPR12: c00000000013d390 c00000001ec54a80 0000000000000000 0000000000000000
> [  172.851482] GPR16: 0000000000000000 0000000000000000 c000000015d5c708 c0000000025396d0
> [  172.851482] GPR20: 0000000000000000 0000000000000000 c00000000a3bbf40 0000000000000003
> [  172.851482] GPR24: 0000000000000000 c0000008ba097400 c0000000161e0d00 c00000000a3bb600
> [  172.851482] GPR28: c000000015d5c700 0000000000000001 0000000082384090 c0000008ba0020d8
> [  172.851549] NIP [c00000000013d600] power_pmu_disable+0x270/0x280
> [  172.851557] LR [c00000000013d5a4] power_pmu_disable+0x214/0x280
> [  172.851565] Call Trace:
> [  172.851568] [c000000017687b00] [c00000000013d5a4] power_pmu_disable+0x214/0x280 (unreliable)
> [  172.851579] [c000000017687b40] [c0000000003403ac] perf_pmu_disable+0x4c/0x60
> [  172.851588] [c000000017687b60] [c0000000003445e4] __perf_event_task_sched_out+0x1d4/0x660
> [  172.851596] [c000000017687c50] [c000000000d1175c] __schedule+0xbcc/0x12a0
> [  172.851602] [c000000017687d60] [c000000000d11ea8] schedule+0x78/0x140
> [  172.851608] [c000000017687d90] [c0000000001a8080] sys_sched_yield+0x20/0x40
> [  172.851615] [c000000017687db0] [c0000000000334dc] system_call_exception+0x18c/0x380
> [  172.851622] [c000000017687e10] [c00000000000c74c] system_call_common+0xec/0x268
> 
> The warning indicates that MSR_EE being set(interrupt enabled) when
> there was an overflown PMC detected. This could happen in
> power_pmu_disable since it runs under interrupt soft disable
> condition ( local_irq_save ) and not with interrupts hard disabled.
> commit 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear
> pending PMI before resetting an overflown PMC") intended to clear
> PMI pending bit in Paca when disabling the PMU. It could happen
> that PMC gets overflown while code is in power_pmu_disable
> callback function. Hence add a check to see if PMI pending bit
> is set in Paca before clearing it via clear_pmi_pending.
> 
> Fixes: 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC")
> Signed-off-by: Athira Rajeev <atrajeev at linux.vnet.ibm.com>
> Reported-by: Sachin Sant <sachinp at linux.ibm.com>
> Tested-by: Sachin Sant <sachinp at linux.ibm.com>
> ---
> Note: Address the warning reported here:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-December/238190.html
> Patch is on top of powerpc/merge
> 
>  arch/powerpc/perf/core-book3s.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index a684901b6965..dfb0ea0f0df3 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -1327,9 +1327,10 @@ static void power_pmu_disable(struct pmu *pmu)
>  		 * Otherwise provide a warning if there is PMI pending, but
>  		 * no counter is found overflown.
>  		 */
> -		if (any_pmc_overflown(cpuhw))
> -			clear_pmi_irq_pending();
> -		else
> +		if (any_pmc_overflown(cpuhw)) {
> +			if (pmi_irq_pending())
> +				clear_pmi_irq_pending();

And this works because MSR[EE] gets disabled by the masked interrupt 
handler if we have a PMI irq pending, is that right?

Can I be a pain and just ask for a little comment there otherwise I'll
forget about it next time.

Thanks,
Nick

> +		} else
>  			WARN_ON(pmi_irq_pending());
>  
>  		val = mmcra = cpuhw->mmcr.mmcra;
> -- 
> 2.33.0
> 
> 


More information about the Linuxppc-dev mailing list