[PATCH v6 2/3] drivers/vfio: EEH support for VFIO PCI device
Alex Williamson
alex.williamson at redhat.com
Wed May 28 03:39:54 EST 2014
On Sat, 2014-05-24 at 12:06 +1000, Gavin Shan wrote:
> On Fri, May 23, 2014 at 08:29:59AM -0600, Alex Williamson wrote:
> >On Fri, 2014-05-23 at 14:37 +1000, Gavin Shan wrote:
> >> On Thu, May 22, 2014 at 09:10:53PM -0600, Alex Williamson wrote:
> >> >On Thu, 2014-05-22 at 18:23 +1000, Gavin Shan wrote:
>
> .../...
>
> >No, sorry, I mean how does the user get information about the error?
> >The interface we have here is:
> >a) find that something bad has happened
> >b) kick it into working again
> >c) continue
> >
> >How does the user figure out what happened and if it makes sense to
> >attempt to recover? Where does the user learn that their disk is on
> >fire?
> >
>
> When 0xFF's are returned from a config or IO read, the user should check
> the device (PE)'s state with the ioctl command VFIO_EEH_PE_GET_STATE. If
> the device (PE) has been put into the "frozen" state, it's confirmed that
> the device ("disk" you mentioned) is on fire.
No, this only confirms that something bad happened, not _what_ bad thing
happened.
> User should kick off recovery, which
> includes:
And here you're just describing the kick operation again...
>
> - User stops any operations (config, IO, DMA) on the device because any
> PCI traffic to "frozen" device will be dropped from software or hardware
> level. Also, we don't expect DMA traffic during recovery. Otherwise,
> we will bump into recursive errors and the recovery should fail.
> - VFIO_EEH_PE_SET_OPTION to enable I/O path ("DMA" path is still under frozen
> state). EEH_VFIO_PE_CONFIGURE to reconfigure affected PCI bridges and then
> do error log retrieval.
These logs, where do they go? How does the user get access? That's
what I'm trying to ask about.
> - VFIO_EEH_PE_RESET to reset the affected device (PE). EEH_VFIO_PE_CONFIGURE
> to restore BARs.
> - User resumes the device to start PCI traffic and the device is brought
> back to a functional state.
>
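To make the ordering in the steps above concrete, here is a minimal userspace model of it. It issues no real ioctls. Every name in it (read_looks_frozen_sketch, pe_step_sketch, pe_recovery_step_sketch) is hypothetical and not part of the patch, and it only checks that the steps arrive in the order described.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: an all-ones read is the hint to go and query
 * the PE state. It does not by itself prove the PE is frozen. */
int read_looks_frozen_sketch(const unsigned char *buf, size_t len)
{
    size_t i;

    if (len == 0)
        return 0;
    for (i = 0; i < len; i++)
        if (buf[i] != 0xFF)
            return 0;
    return 1;
}

/* Hypothetical step names mirroring the list above, in order. */
enum pe_step_sketch {
    STEP_ENABLE_IO,      /* enable the I/O path, DMA still frozen */
    STEP_CONFIGURE_LOG,  /* reconfigure bridges, retrieve error log */
    STEP_RESET,          /* reset the PE */
    STEP_RESTORE_BARS,   /* configure again to restore BARs */
    STEP_RESUME,         /* resume PCI traffic */
    STEP_DONE
};

struct pe_recovery_sketch {
    enum pe_step_sketch next;  /* the only step accepted next */
};

/* Accept a step only if it is the next one in the sequence.
 * Returns 0 on success, -EINVAL for an out-of-order step. */
int pe_recovery_step_sketch(struct pe_recovery_sketch *r,
                            enum pe_step_sketch step)
{
    if (step == STEP_DONE || step != r->next)
        return -EINVAL;
    r->next = (enum pe_step_sketch)(step + 1);
    return 0;
}
```

Note that nothing in this model tells the user what went wrong, which is the gap I keep asking about.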
> .../...
>
> >
> >No, I prefer to stay consistent with the rest of the VFIO API and use
> >argsz + flags.
> >
>
> Here's the recap for previous reply: I have several cases for ioctl().
>
> - ioctl(fd, cmd, NULL): I don't need any input info.
> - ioctl(fd, cmd, &data): I need input info.
>
> For all the cases, should I simply have a data struct to include "argsz+flags"?
Anything that requires data should have argsz+flags. If it doesn't
require data, it doesn't need them, but think long and hard about whether
there's any possibility that we'll need parameters in the future.
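As a rough illustration of the argsz+flags convention, here is a userspace sketch. The struct name, its option field, and the check function are all hypothetical and not taken from the patch. The point is only that the handler rejects a caller whose argsz is smaller than the fields it needs, and rejects flag bits it doesn't yet understand.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout for illustration only. */
struct eeh_op_sketch {
    uint32_t argsz;   /* size of the struct as the caller sees it */
    uint32_t flags;   /* no flags defined yet, so must be zero */
    uint32_t option;  /* hypothetical payload field */
};

/* Minimum size: everything up to and including the last field
 * this version of the handler reads. */
#define EEH_OP_SKETCH_MINSZ \
    (offsetof(struct eeh_op_sketch, option) + sizeof(uint32_t))

/* Returns 0 if the argument is usable, -EINVAL otherwise. */
int eeh_op_sketch_check(const struct eeh_op_sketch *op)
{
    if (op->argsz < EEH_OP_SKETCH_MINSZ)
        return -EINVAL;  /* caller's struct is too small */
    if (op->flags != 0)
        return -EINVAL;  /* unknown flag bits */
    return 0;
}
```

A larger argsz is accepted, which is what lets a newer caller with extra trailing fields talk to an older handler.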
> For the return value from ioctl(), can we simply have an additional field in
> the above data struct to carry it? "0" is the information I have to return for
> some of the cases.
If for instance your ioctl is returning something like "number of
errors", then it's perfectly fine to use that as the ioctl return. <0
is an error, >= 0 is success with a value.
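A small sketch of that convention, with made-up names (pe_error_count_sketch and split_ret_sketch are hypothetical, not anything in the patch): the handler returns either a negative errno or a non-negative count, and the caller splits that single value back into status and payload.

```c
#include <errno.h>

/* Hypothetical handler: negative errno on failure, otherwise the
 * number of errors (zero is a valid, successful answer). */
long pe_error_count_sketch(int pe_present, unsigned int nr_errors)
{
    if (!pe_present)
        return -ENODEV;
    return (long)nr_errors;
}

/* Caller side: separate status from value. On failure *value is
 * left untouched. */
int split_ret_sketch(long ret, unsigned int *value)
{
    if (ret < 0)
        return (int)ret;
    *value = (unsigned int)ret;
    return 0;
}
```

From userspace the libc ioctl() wrapper reports failure as -1 with errno set, so the caller checks for a negative return and reads errno rather than the negative value itself.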