Spurious interrupts and PCIE_PME

Shakeeb B K shakeebbk at gmail.com
Tue Apr 26 17:30:10 AEST 2022


Hello All,

We have a system with a PCIe link between a BMC (AST2600) acting as
root complex (RC) and an FPGA acting as endpoint (EP).
By design, we use the legacy INTx interrupt for signalling as part of
the communication protocol.

On OpenBMC, we have a user-space application that uses the vfio_pci
driver for device access, and we use VFIO interrupts for INTx.
We have also noticed that the pcie_pme driver shares the same INTx
line for PME events.
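
The sharing described above can be confirmed from /proc/interrupts,
where a shared line lists every registered handler name on one row.
The following is a minimal Python sketch of that check. The handler
names "PCIe PME" and "vfio-intx(...)" are what we would expect these
drivers to register, but treat them as assumptions and verify them on
your kernel; the sample text, IRQ numbers and device address are
hypothetical.

```python
def irqs_shared_by(interrupts_text, *names):
    """Return IRQ labels whose /proc/interrupts line mentions every name."""
    shared = []
    for line in interrupts_text.splitlines():
        # Split on the first colon only: the part before it is the IRQ label.
        label, sep, rest = line.partition(":")
        if sep and all(name in rest for name in names):
            shared.append(label.strip())
    return shared


# Hypothetical excerpt; real IRQ numbers, counts and chip names will differ.
SAMPLE = """\
           CPU0       CPU1
 35:       1042          0     GIC-0  72 Level     eth0
 58:     990311          0     GIC-0 200 Level     PCIe PME, vfio-intx(0000:01:00.0)
"""

if __name__ == "__main__":
    print(irqs_shared_by(SAMPLE, "PCIe PME", "vfio-intx"))
    # To inspect a live system instead of the sample:
    # with open("/proc/interrupts") as f:
    #     print(irqs_shared_by(f.read(), "PCIe PME", "vfio-intx"))
```

A row that still lists "PCIe PME" after the application has exited,
with a rapidly growing count, would be consistent with the behaviour
described in this mail.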

In this setup we are hitting a peculiar failure case: if the
user-space application terminates abruptly (e.g. on SIGKILL) without
cleaning up, the protocol is left in a state where the EP keeps
asserting INTx while the vfio handler is gone.
Since pcie_pme shares the same interrupt line, its handler keeps
being invoked for these spurious interrupts, which nobody services.
The spurious interrupts eventually hog the pcie_pme handler and lead
to soft lockups and kernel panics.
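
To illustrate the failure mode, here is a toy Python model of a
level-triggered line shared by two handlers. It is only a sketch and
not kernel code: the Device class, make_handler() and run_line() are
hypothetical names, and the model assumes that each handler services
only its own device and that the line stays asserted until the
asserting device is acknowledged.

```python
"""Toy model of a shared, level-triggered INTx line (not kernel code)."""

IRQ_NONE, IRQ_HANDLED = 0, 1


class Device:
    """A device that holds the shared line asserted while `pending` is True."""

    def __init__(self, name):
        self.name = name
        self.pending = False


def make_handler(device):
    """Return a handler that only services (and clears) its own device."""
    def handler():
        if device.pending:
            device.pending = False   # acknowledge: device deasserts the line
            return IRQ_HANDLED
        return IRQ_NONE              # interrupt was not raised by this device
    return handler


def run_line(devices, handlers, max_passes=1000):
    """Dispatch the shared line until it deasserts or max_passes is reached.

    Returns (passes, unhandled), where `unhandled` counts the passes in
    which every registered handler returned IRQ_NONE.
    """
    passes = unhandled = 0
    while any(d.pending for d in devices) and passes < max_passes:
        passes += 1
        results = [h() for h in handlers]   # every sharer is called
        if IRQ_HANDLED not in results:
            unhandled += 1
    return passes, unhandled


if __name__ == "__main__":
    fpga, pme = Device("fpga-ep"), Device("pcie-pme")

    # Healthy case: both handlers are registered on the line.
    fpga.pending = True
    print(run_line([fpga, pme], [make_handler(fpga), make_handler(pme)]))

    # Application killed: only the PME handler remains on the line.
    fpga.pending = True
    print(run_line([fpga, pme], [make_handler(pme)]))
```

In the healthy case the line deasserts after a single pass. With only
the PME handler left, the loop runs until max_passes with every pass
unhandled, which is the interrupt storm in miniature.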

We understand that we could handle the vfio device in a better and
more robust way, but we would like to learn from past experience:
has anybody hit such issues before?
What has been the experience with pcie_pme sharing INTx with any
other application?
Any input would be greatly appreciated.

Thanks,
Shakeeb

