[PATCH v7 3/3] drivers/vfio: EEH support for VFIO PCI device

Alexander Graf agraf at suse.de
Wed May 28 23:12:35 EST 2014


On 28.05.14 14:49, Gavin Shan wrote:
> On Wed, May 28, 2014 at 01:41:35PM +0200, Alexander Graf wrote:
>> On 28.05.14 02:55, Gavin Shan wrote:
>>> On Tue, May 27, 2014 at 12:15:27PM -0600, Alex Williamson wrote:
>>>> On Tue, 2014-05-27 at 18:40 +1000, Gavin Shan wrote:
>>>>> The patch adds new IOCTL commands for sPAPR VFIO container device
>>>>> to support EEH functionality for PCI devices, which have been passed
>>>>> through from host to somebody else via VFIO.
>>>>>
>>>>> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>>>>> ---
>>>>>   Documentation/vfio.txt              | 92 ++++++++++++++++++++++++++++++++++++-
>>>>>   drivers/vfio/pci/Makefile           |  1 +
>>>>>   drivers/vfio/pci/vfio_pci.c         | 20 +++++---
>>>>>   drivers/vfio/pci/vfio_pci_eeh.c     | 46 +++++++++++++++++++
>>>>>   drivers/vfio/pci/vfio_pci_private.h |  5 ++
>>>>>   drivers/vfio/vfio_iommu_spapr_tce.c | 85 ++++++++++++++++++++++++++++++++++
>>>>>   include/uapi/linux/vfio.h           | 66 ++++++++++++++++++++++++++
>>>>>   7 files changed, 308 insertions(+), 7 deletions(-)
>>>>>   create mode 100644 drivers/vfio/pci/vfio_pci_eeh.c
>> [...]
>>
>>>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>>>>> index cb9023d..c5fac36 100644
>>>>> --- a/include/uapi/linux/vfio.h
>>>>> +++ b/include/uapi/linux/vfio.h
>>>>> @@ -455,6 +455,72 @@ struct vfio_iommu_spapr_tce_info {
>>>>>   #define VFIO_IOMMU_SPAPR_TCE_GET_INFO	_IO(VFIO_TYPE, VFIO_BASE + 12)
>>>>> +/*
>>>>> + * EEH functionality can be enabled or disabled on one specific device.
>>>>> + * Also, the DMA or IO frozen state can be removed from the frozen PE
>>>>> + * if required.
>>>>> + */
>>>>> +struct vfio_eeh_pe_set_option {
>>>>> +	__u32 argsz;
>>>>> +	__u32 flags;
>>>>> +	__u32 option;
>>>>> +#define VFIO_EEH_PE_SET_OPT_DISABLE	0	/* Disable EEH	*/
>>>>> +#define VFIO_EEH_PE_SET_OPT_ENABLE	1	/* Enable EEH	*/
>>>>> +#define VFIO_EEH_PE_SET_OPT_IO		2	/* Enable IO	*/
>>>>> +#define VFIO_EEH_PE_SET_OPT_DMA		3	/* Enable DMA	*/
>>>> This is more of a "command" than an "option" isn't it?  Each of these
>>>> probably needs a more significant description.
>>>>
>>> Yeah, it would be regarded as "opcode" and I'll add more description about
>>> them in next revision.
>> Please just call them commands.
>>
> Ok. I guess you want me to change the macro names like this ?
>
> #define VFIO_EEH_CMD_DISABLE	0	/* Disable EEH functionality	*/
> #define VFIO_EEH_CMD_ENABLE	1	/* Enable EEH functionality	*/
> #define VFIO_EEH_CMD_ENABLE_IO	2	/* Enable IO for frozen PE	*/
> #define VFIO_EEH_CMD_ENABLE_DMA	3	/* Enable DMA for frozen PE	*/

Yes, the ioctl name too.

>
>>>>> +};
>>>>> +
>>>>> +#define VFIO_EEH_PE_SET_OPTION		_IO(VFIO_TYPE, VFIO_BASE + 21)
>>>>> +
>>>>> +/*
>>>>> + * Each EEH PE should have unique address to be identified. PE's
>>>>> + * sharing mode is also useful information as well.
>>>>> + */
>>>>> +#define VFIO_EEH_PE_GET_ADDRESS		0	/* Get address	*/
>>>>> +#define VFIO_EEH_PE_GET_MODE		1	/* Query mode	*/
>>>>> +#define VFIO_EEH_PE_MODE_NONE		0	/* Not a PE	*/
>>>>> +#define VFIO_EEH_PE_MODE_NOT_SHARED	1	/* Exclusive	*/
>>>>> +#define VFIO_EEH_PE_MODE_SHARED		2	/* Shared mode	*/
>>>>> +
>>>>> +/*
>>>>> + * EEH PE might have been frozen because of PCI errors. Also, it might
>>>>> + * be experiencing reset for error revoery. The following command helps
>>>>> + * to get the state.
>>>>> + */
>>>>> +struct vfio_eeh_pe_get_state {
>>>>> +	__u32 argsz;
>>>>> +	__u32 flags;
>>>>> +	__u32 state;
>>>>> +};
>>>> Should state be a union to better describe the value returned?  What
>>>> exactly is the address and why does the user need to know it?  Does this
>>>> need user input or could we just return the address and mode regardless?
>>>>
>>> Ok. I think you want enum (not union) for state. I'll have macros for the
>>> state in next revision as I did that for other cases.
>>>
>>> Those macros defined for "address" just for ABI stuff as Alex.G mentioned.
>>> There isn't corresponding ioctl command for host to get address any more
>>> because QEMU (user) will have to figure it out by himself. The "address"
>>> here means PE address and user has to figure it out according to PE
>>> segmentation.
>> Why would the user ever need the address?
>>
> I will remove those "address" related macros in next revision because it's
> user-level bussiness, not related to host kernel any more.

Ok, so the next question is whether there will be any state outside of 
GET_MODE left in the future.


Alex

> If the user is QEMU + guest, we need the address to identify the PE though PHB
> BUID could be used as same purpose to get PHB, which is one-to-one mapping with
> IOMMU group on sPAPR platform. However, once the PE address is built and returned
> to guest, guest will use the PE address as input parameter in subsequent RTAS
> calls.
>
> If the user is some kind of application who just uses the ioctl() without supporting
> RTAS calls. We don't need care about PE address.
>
> Thanks,
> Gavin
>
> --
> To unsubscribe from this list: send the line "unsubscribe kvm-ppc" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



More information about the Linuxppc-dev mailing list