[PATCH v3 1/2] PCI: One more parameter to pci_set_pcie_reset_state()

Benjamin Herrenschmidt benh at kernel.crashing.org
Tue Mar 24 11:23:55 AEDT 2015

On Mon, 2015-03-23 at 15:08 -0600, Alex Williamson wrote:
> I don't know what you're doing on POWER, I thought groups were
> equivalent to PEs, but on x86 we learn about isolation of PCI functions
> by standard PCI properties.  Devices need to tell us that they're
> isolated via ACS capabilities (or quirks, with vendor approval).
> Otherwise we assume there is no isolation and multi-function devices
> will be grouped together, preventing the individual functions from being
> assigned to separate userspace instances.  VFIO does not ignore the
> problem of internal communication between functions, it uses ACS... at
> least on most platforms.

POWER doesn't allow functions in different PEs today (mostly for a
different reason which is the problem we have with MMIO assignment to
different PEs, though we are working on ways to fix that in the future
in which case we might start using ACS).

There's still a potential issue with LSIs though, I haven't looked how
you handle it on x86.

> Firmware shared between functions is an unsolvable problem outside of
> restricting device assignment only to SR-IOV virtual functions.  We
> cannot know the programming method to block firmware updates of an
> assigned device nor can we control which functions share the firmware.
> If you're worried about this level of attack against the hardware, use
> VFs exclusively.

You cannot trust HW :-)

> If you're unsatisfied with the grouping of devices on your platform,
> this is entirely within your control since VFIO relies on IOMMU groups,
> which are managed via the platform IOMMU driver.  If your IOMMU driver
> is arbitrarily splitting each PCI function into a separate group without
> testing the isolation, that's a problem with IOMMU groups on your
> platform.  If you would like to add a reset requirement to a group, you
> have the ability to do that.  On x86, I expect this would be far too
> restrictive.

That means that you cannot recover devices that don't support FLR...

> VFIO also knows how to do bus and slot resets, the pci_reset_bus() and
> pci_reset_slot() interfaces were added by me with vfio-pci as the first
> user.  Resets are generally handled as best effort though and this is
> sometimes a property of a device that might make it unsuitable for
> device assignment.

Or unsuitable as a multi-function split assignment, we can still give
the whole slot to the guest.

> The fact that you're implying above that the devices covered by the
> device specific reset proposed in this series can be assigned separately
> makes ignoring the scope of pci_reset_function() all the more wrong.  If
> you want to change grouping on POWER to require that a group can be
> reset, that's an IOMMU driver issue.  If you want to make a change to
> VFIO to require an opt-in/out of allowing access to groups without a
> reset, then propose something.  Just please don't ignore the semantics
> of existing functions because it's a quicker and easier hack than doing
> it correctly.  Thanks,

The IOMMU driver unfortunately cannot know whether the devices support a
good reset granularity without quirks ... Oh well. It's a mess, thanks
for Intel for not making FLR mandatory since PCI 1.0 .... :-)


More information about the Linuxppc-dev mailing list