[PATCH kernel 0/3 REPOST] vfio-pci: Add support for mmapping MSI-X table

Alexey Kardashevskiy aik at ozlabs.ru
Fri Jun 23 15:06:37 AEST 2017


On 23/06/17 07:11, Alex Williamson wrote:
> On Thu, 15 Jun 2017 15:48:42 +1000
> Alexey Kardashevskiy <aik at ozlabs.ru> wrote:
> 
>> Here is a patchset which Yongji was working on before
>> leaving IBM LTC. Since we still want to have this functionality
>> in the kernel (DPDK is the first user), here is a rebase
>> on the current upstream.
>>
>>
>> The current vfio-pci implementation disallows mmapping the page
>> containing the MSI-X table, because users could otherwise write
>> directly to the MSI-X table and generate incorrect MSIs.
>>
>> However, this causes a performance issue when critical device
>> registers sit in the same page as the MSI-X table: MMIO accesses
>> to these registers have to be handled by QEMU emulation rather
>> than in the guest.
>>
>> To solve this issue, this series allows exposing the MSI-X table
>> to userspace when the hardware enables interrupt remapping,
>> which ensures that a given PCI device can only shoot the MSIs
>> assigned to it. We also introduce a new bus flag,
>> PCI_BUS_FLAGS_MSI_REMAP, to test this capability on the PCI side
>> for different archs.
>>
>> Patch 3 is based on the proposed patchset [1].
>>
>> Changelog
>> v3:
>> - rebased on the current upstream
> 
> There's something not forthcoming here, the last version I see from
> Yongji is this one:
> 
> https://lists.linuxfoundation.org/pipermail/iommu/2016-June/017245.html
> 
> Which was a 6-patch series where patches 2-4 tried to apply
> PCI_BUS_FLAGS_MSI_REMAP for cases that supported other platforms.  That
> doesn't exist here, so it's not simply a rebase.  Patch 1/ seems to
> equate this new flag to the IOMMU capability IOMMU_CAP_INTR_REMAP, but
> nothing is done here to match them together.  That patch also mentions
> the work Eric has done for similar features on ARM, but again those
> patches are dropped.  It seems like an incomplete feature now.  Thanks,


Thanks! I suspected this was not the latest version, but I could not find
anything newer than what we use internally for tests, and I could not reach
Yongji to confirm whether this was the latest update.

As I read the patches, I notice that the term "msi remap" is used all over
the place. While remapping may be what x86/arm do (and therefore the
IOMMU_CAP_INTR_REMAP flag makes sense there), powernv does not do remapping
but provides hardware isolation. When we allow mapping the MSI-X BAR to
userspace, isolation is what we really care about. Would it make sense to
rename PCI_BUS_FLAGS_MSI_REMAP to PCI_BUS_FLAGS_MSI_ISOLATED?
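
To illustrate the naming point, here is a minimal user-space sketch of an
"isolated" check, not kernel code. The mock_* types and functions are
hypothetical stand-ins (they are not struct pci_bus or the series' helper),
the flag value is arbitrary, and walking up to the root bus is an assumption
about where the flag would live.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed flag; the value is arbitrary. */
enum mock_bus_flags {
	MOCK_BUS_FLAGS_MSI_ISOLATED = 1 << 0,
};

/* Hypothetical stand-in for a PCI bus; not the kernel's struct pci_bus. */
struct mock_bus {
	struct mock_bus *parent;	/* NULL for a root bus */
	unsigned int flags;
};

/*
 * Return true if the root bus above @bus is marked as isolating MSIs.
 * The caller does not care how: interrupt remapping (x86/arm) and
 * hardware isolation (powernv) would both set the same flag.
 */
static bool mock_bus_msi_isolated(const struct mock_bus *bus)
{
	if (!bus)
		return false;
	while (bus->parent)
		bus = bus->parent;
	return (bus->flags & MOCK_BUS_FLAGS_MSI_ISOLATED) != 0;
}

/* Allow mmap of the page holding the MSI-X table only when isolated. */
static bool mock_can_mmap_msix(const struct mock_bus *bus)
{
	return mock_bus_msi_isolated(bus);
}
```

The point of the sketch is that the consumer only asks "is this device
isolated?", which is why the ISOLATED name may describe the flag better
than REMAP.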

Another thing: the patchset sets PCI_BUS_FLAGS_MSI_REMAP when the IOMMU
merely advertises IOMMU_CAP_INTR_REMAP, not necessarily when it uses it.
Should the patchset actually look at something like irq_remapping_enabled
in drivers/iommu/amd_iommu.c instead?
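
The difference between "advertised" and "in use" can be sketched in plain
user-space C as follows. This is only an illustration under assumptions:
struct mock_iommu and both helpers are hypothetical, and the remap_enabled
field merely models something like irq_remapping_enabled, not the actual
kernel variable.

```c
#include <stdbool.h>

/* Hypothetical model of an IOMMU's interrupt-remapping state. */
struct mock_iommu {
	bool cap_intr_remap;	/* capability is advertised */
	bool remap_enabled;	/* remapping is actually in use */
};

/* Policy as described above: set the bus flag if the capability exists. */
static bool mock_msi_flag_if_advertised(const struct mock_iommu *iommu)
{
	return iommu->cap_intr_remap;
}

/* Stricter policy: also require that remapping is really enabled. */
static bool mock_msi_flag_if_enabled(const struct mock_iommu *iommu)
{
	return iommu->cap_intr_remap && iommu->remap_enabled;
}
```

With an IOMMU that advertises the capability but has remapping switched
off, the first helper would set the flag and the second would not, which
is the case the question is about.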



> 
> Alex
> 
>> v2:
>> - Make the commit log clearer
>> - Replace pci_bus_check_msi_remapping() with pci_bus_msi_isolated()
>>   so that we could clearly know what the function does
>> - Set PCI_BUS_FLAGS_MSI_REMAP in pci_create_root_bus() instead
>>   of iommu_bus_notifier()
>> - Reserve VFIO_REGION_INFO_FLAG_CAPS when we allow to mmap MSI-X
>>   table so that we can know whether we allow to mmap MSI-X table
>>   in QEMU
>>
>> [1] https://www.mail-archive.com/linux-kernel%40vger.kernel.org/msg1138820.html
>>
>>
>> This is based on sha1
>> 63f700aab4c1 Linus Torvalds "Merge tag 'xtensa-20170612' of git://github.com/jcmvbkbc/linux-xtensa".
>>
>> Please comment. Thanks.
>>
>>
>>
>> Yongji Xie (3):
>>   PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag
>>   pci-ioda: Set PCI_BUS_FLAGS_MSI_REMAP for IODA host bridge
>>   vfio-pci: Allow to expose MSI-X table to userspace if interrupt
>>     remapping is enabled
>>
>>  include/linux/pci.h                       |  1 +
>>  arch/powerpc/platforms/powernv/pci-ioda.c |  8 ++++++++
>>  drivers/vfio/pci/vfio_pci.c               | 18 +++++++++++++++---
>>  drivers/vfio/pci/vfio_pci_rdwr.c          |  3 ++-
>>  4 files changed, 26 insertions(+), 4 deletions(-)
>>
> 


-- 
Alexey

