[PATCH v6 2/3] drivers/vfio: EEH support for VFIO PCI device

Gavin Shan gwshan at linux.vnet.ibm.com
Sat May 24 12:06:21 EST 2014

On Fri, May 23, 2014 at 08:29:59AM -0600, Alex Williamson wrote:
>On Fri, 2014-05-23 at 14:37 +1000, Gavin Shan wrote:
>> On Thu, May 22, 2014 at 09:10:53PM -0600, Alex Williamson wrote:
>> >On Thu, 2014-05-22 at 18:23 +1000, Gavin Shan wrote:


>No, sorry, I mean how does the user get information about the error?
>The interface we have here is:
>a) find that something bad has happened
>b) kick it into working again
>c) continue
>How does the user figure out what happened and if it makes sense to
>attempt to recover?  Where does the user learn that their disk is on

When 0xFF's are returned from a config or IO read, the user should check
the device (PE)'s state with the ioctl command VFIO_EEH_PE_GET_STATE. If the
device (PE) has been put into the "frozen" state, it's confirmed the device
("disk" you mentioned) is on fire. User should kick off recovery, which

- User stops any operations (config, IO, DMA) on the device because any
  PCI traffic to a "frozen" device will be dropped at the software or
  hardware level. Also, we don't expect DMA traffic during recovery.
  Otherwise, we will bump into recursive errors and the recovery is
  expected to fail.
- VFIO_EEH_PE_SET_OPTION to enable the I/O path (the "DMA" path is still in
  frozen state). VFIO_EEH_PE_CONFIGURE to reconfigure the affected PCI
  bridges and then do error log retrieval.
- VFIO_EEH_PE_RESET to reset the affected device (PE). VFIO_EEH_PE_CONFIGURE
  to restore BARs.
- User resumes the device to start PCI traffic and the device is brought
  back to a functional state.
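
To make the ordering concrete, here is a rough sketch of the sequence above
in C. It is only an illustration under assumptions, not code from the patch:
the command names are the ones used in this mail (with the configure command
spelled VFIO_EEH_PE_CONFIGURE to match the prefix of the other three), while
the numeric values, EEH_OPT_ENABLE_IO, EEH_STATE_FROZEN, eeh_recover() and
the fake backend are all hypothetical. It also assumes the PE state comes
back as the ioctl return value. The backend is passed in as a function
pointer so the order of calls can be checked without real hardware.

```c
#include <stddef.h>

/* Command names follow the mail; the numeric values are placeholders. */
enum eeh_cmd {
    VFIO_EEH_PE_SET_OPTION = 1,
    VFIO_EEH_PE_GET_STATE,
    VFIO_EEH_PE_RESET,
    VFIO_EEH_PE_CONFIGURE
};

#define EEH_OPT_ENABLE_IO 1 /* hypothetical option code */
#define EEH_STATE_FROZEN  2 /* hypothetical "frozen" state code */

/* Same shape as ioctl(fd, cmd, arg), so a fake can stand in for it. */
typedef int (*eeh_ioctl_fn)(int fd, int cmd, void *arg);

/* Returns 1 if recovery ran, 0 if the PE was not frozen, <0 on error. */
int eeh_recover(int fd, eeh_ioctl_fn do_ioctl)
{
    int opt = EEH_OPT_ENABLE_IO;
    int ret;

    /* Assumption: the PE state comes back as the ioctl return value. */
    ret = do_ioctl(fd, VFIO_EEH_PE_GET_STATE, NULL);
    if (ret < 0)
        return ret;
    if (ret != EEH_STATE_FROZEN)
        return 0;

    /* The caller must already have stopped config, IO and DMA traffic. */

    /* Enable the I/O path, then reconfigure the affected bridges. */
    ret = do_ioctl(fd, VFIO_EEH_PE_SET_OPTION, &opt);
    if (ret < 0)
        return ret;
    ret = do_ioctl(fd, VFIO_EEH_PE_CONFIGURE, NULL);
    if (ret < 0)
        return ret;
    /* Error log retrieval would happen here; it is not modelled. */

    /* Reset the PE, then configure again to restore the BARs. */
    ret = do_ioctl(fd, VFIO_EEH_PE_RESET, NULL);
    if (ret < 0)
        return ret;
    ret = do_ioctl(fd, VFIO_EEH_PE_CONFIGURE, NULL);
    if (ret < 0)
        return ret;

    return 1; /* the caller may now resume PCI traffic */
}

/* Fake backend: records the command order and reports a chosen state. */
int call_log[8];
int call_count;
int fake_state = EEH_STATE_FROZEN;

int fake_ioctl(int fd, int cmd, void *arg)
{
    (void)fd;
    (void)arg;
    if (call_count < 8)
        call_log[call_count++] = cmd;
    return cmd == VFIO_EEH_PE_GET_STATE ? fake_state : 0;
}
```

Calling eeh_recover(fd, fake_ioctl) with fake_state set to EEH_STATE_FROZEN
records five commands in call_log: get state, set option, configure, reset,
configure. A real caller would pass a thin wrapper around ioctl() instead.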


>No, I prefer to stay consistent with the rest of the VFIO API and use
>argsz + flags.

Here's a recap of my previous reply: I have several cases for ioctl().

- ioctl(fd, cmd, NULL):   I don't need any input info.
- ioctl(fd, cmd, &data):  I need input info.

For all the cases, should I simply have a data struct that includes
"argsz+flags"?

For the return value from ioctl(), can we simply have an additional field in
the above data struct to carry it? "0" is information I have to return in
some of the cases.
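
For discussion, a data struct along those lines might look like the sketch
below. Everything in it is hypothetical (struct eeh_pe_arg, its option and
result fields, and both helpers); only the argsz + flags header and the
"argsz must cover the fields we read" check follow the existing VFIO
convention. The result field is there so that "0" can be handed back as data
while the ioctl return value only says success or failure.

```c
#include <stddef.h>
#include <stdint.h>
#include <errno.h>

/* Hypothetical layout; not taken from the patch. */
struct eeh_pe_arg {
    uint32_t argsz;  /* size of this struct as filled in by the caller */
    uint32_t flags;  /* no flags are defined in this sketch; must be 0 */
    uint32_t option; /* input, for the commands that need one */
    uint32_t result; /* output, so that 0 can be reported as data */
};

/* Smallest size that still covers every field up to and including result. */
#define EEH_PE_ARG_MINSZ \
    (offsetof(struct eeh_pe_arg, result) + sizeof(uint32_t))

/* Validate the common header the way an argsz + flags handler would. */
int eeh_pe_arg_check(const struct eeh_pe_arg *arg)
{
    if (arg == NULL || arg->argsz < EEH_PE_ARG_MINSZ)
        return -EINVAL;
    if (arg->flags != 0)
        return -EINVAL;
    return 0;
}

/* Stand-in for a "get state" handler: the state travels in arg->result. */
int eeh_pe_get_state(struct eeh_pe_arg *arg, uint32_t pe_state)
{
    int ret = eeh_pe_arg_check(arg);

    if (ret)
        return ret;
    arg->result = pe_state;
    return 0;
}
```

With this shape, a command that needs no input still takes the struct, with
argsz set to sizeof(struct eeh_pe_arg) and the other fields zeroed.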


>As agraf noted, I'm asking why reset and configure are separate when
>they seem to be used together.

OK, to recap: they're 2 separate steps of error recovery as defined in the
PAPR spec. Also, they correspond to 2 separate RTAS calls. So I don't think
we can put them together.

