[PATCH v2] PCI/AER: Handle Multi UnCorrectable/Correctable errors properly
Sathyanarayanan Kuppuswamy
sathyanarayanan.kuppuswamy at linux.intel.com
Wed Mar 16 04:26:46 AEDT 2022
On 3/15/22 10:14 AM, Eric Badger wrote:
>> # Prep injection data for a correctable error.
>> $ cd /sys/kernel/debug/apei/einj
>> $ echo 0x00000040 > error_type
>> $ echo 0x4 > flags
>> $ echo 0x891000 > param4
>>
>> # Root Error Status is initially clear
>> $ setpci -s <Dev ID> ECAP0001+0x30.w
>> 0000
>>
>> # Inject one error
>> $ echo 1 > error_inject
>>
>> # Interrupt received
>> pcieport <Dev ID>: AER: Root Error Status 0001
>>
>> # Inject another error (within 5 seconds)
>> $ echo 1 > error_inject
>>
>> # No interrupt received, but "multiple ERR_COR" is now set
>> $ setpci -s <Dev ID> ECAP0001+0x30.w
>> 0003
>>
>> # Wait for a while, then clear ERR_COR. A new interrupt immediately
>> # fires.
>> $ setpci -s <Dev ID> ECAP0001+0x30.w=0x1
>> pcieport <Dev ID>: AER: Root Error Status 0002
>>
>> Currently, the above issue has only been reproduced on the ICL server
>> platform.
>>
>> [Eric: proposed reproducing steps]
> Hmm, this differs from the procedure I described on v1, and I don't
> think it will work as described here.
I tried to modify the steps so the issue can be reproduced without
returning IRQ_NONE for all cases (which would break the functionality),
but I think I did not correct the last few steps.
How about replacing the last 3 steps with the following?
# Inject another error (within 5 seconds)
$ echo 1 > error_inject
# You will get a new IRQ with only multiple ERR_COR bit set
pcieport <Dev ID>: AER: Root Error Status 0002
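To make the Root Error Status values in these steps easier to read, here is a
minimal sketch of a shell helper that decodes the low four bits of the
register. The function name decode_root_status and the labels it prints are
made up for illustration; the bit positions follow the AER Root Error Status
layout (bit 0 ERR_COR received, bit 1 multiple ERR_COR received, bit 2
ERR_FATAL/NONFATAL received, bit 3 multiple ERR_FATAL/NONFATAL received). It
assumes the input is a bare hex string as printed by setpci.

```shell
# Hypothetical helper: decode a Root Error Status value given as bare hex
# (for example "0003", the form setpci prints). Only bits 0-3 are decoded.
decode_root_status() {
    status=$((0x$1))
    out=""
    if [ $((status & 0x1)) -ne 0 ]; then out="$out ERR_COR"; fi
    if [ $((status & 0x2)) -ne 0 ]; then out="$out MULTI_ERR_COR"; fi
    if [ $((status & 0x4)) -ne 0 ]; then out="$out ERR_FATAL/NONFATAL"; fi
    if [ $((status & 0x8)) -ne 0 ]; then out="$out MULTI_ERR_FATAL/NONFATAL"; fi
    if [ -z "$out" ]; then
        echo "(clear)"
    else
        # Strip the leading space left by the first append.
        echo "${out# }"
    fi
}
```

With this, the value 0001 from the first interrupt decodes to ERR_COR, and
the 0002 in the proposed final step decodes to MULTI_ERR_COR only. The
setpci read from the steps above could be passed in as the argument, with
the device ID filled in.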
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer