[PATCH 07/31] powerpc/xive: Fix xive_irq_set_affinity for MSI
Cédric Le Goater
clg at kaod.org
Fri May 21 03:25:06 AEST 2021
On 5/14/21 10:48 PM, Thomas Gleixner wrote:
> On Fri, Apr 30 2021 at 10:03, Cédric Le Goater wrote:
>> The MSI affinity is automanaged and it can be set before starting the
>> associated IRQ.
>>
>> ( Should we simply remove the irqd_is_started() test ? )
>
> If the hardware can handle it properly.
>
> But see:
>
> cffb717ceb8e ("powerpc/xive: Ensure active irqd when setting affinity")
Thanks for digging. That's a patch from the early days of XIVE support.
> which introduced that condition. It mutters something about migration of
> shutdown interrupts:
>
> [ 123.053037264,3] XIVE[ IC 00 ] ISN 2 lead to invalid IVE !
The XIVE driver in OPAL is complaining.
Linux is trying to configure the target of HW IRQ number 2 but OPAL refuses
because it's invalid. The first 16 are reserved (like on Linux).
So it's another problem. 2 could be a value from an "interrupts" property,
giving the INTx number assigned to a PCI device or an OPAL event IRQ
number leaked into the XIVE domain. Given the low Linux IRQ number that
might be the latter.
> [ 77.885859] xive: Error -6 reconfiguring irq 17
> [ 77.885862] IRQ17: set affinity failed(-6).
>
> Not that I can decode that :)
A device name would help but you have guessed most of it ;)
>
> Non-managed interrupts have the sequence:
>
> startup()
> set_affinity()
>
> which is historical and an earlier attempt to flip it caused havoc in
> some places.
>
> With managed we needed to make sure that the affinity is set correctly
> right at start. So it needs to be done the other way round and it turned
> out that for MSI this works.
>
> I have no idea, whether that might make the above issue reappear or
> not. If so, then we need some extra state to make it work.
>
> The root cause which triggered the problem got fixed, so there should be
> no issue _if_ this was specifically related to that CPU unplug case.
I would vote for this option. I will simply remove the irqd_is_started()
test which looks bogus and do some extra tests on all platforms.
Thanks,
C.
>> diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
>> index 96737938e8e3..3485baf9ec8c 100644
>> --- a/arch/powerpc/sysdev/xive/common.c
>> +++ b/arch/powerpc/sysdev/xive/common.c
>> @@ -710,7 +710,7 @@ static int xive_irq_set_affinity(struct irq_data *d,
>> return -EINVAL;
>>
>> /* Don't do anything if the interrupt isn't started */
>> - if (!irqd_is_started(d))
>> + if (!irqd_is_started(d) && !irqd_affinity_is_managed(d))
>> return IRQ_SET_MASK_OK;
>>
>> /*
>
> Thanks,
>
> tglx
>
More information about the Linuxppc-dev
mailing list