[PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

Alexey Kardashevskiy aik at ozlabs.ru
Fri May 6 16:35:38 AEST 2016

On 05/06/2016 01:05 AM, Alex Williamson wrote:
> On Thu, 5 May 2016 12:15:46 +0000
> "Tian, Kevin" <kevin.tian at intel.com> wrote:
>>> From: Yongji Xie [mailto:xyjxie at linux.vnet.ibm.com]
>>> Sent: Thursday, May 05, 2016 7:43 PM
>>> Hi David and Kevin,
>>> On 2016/5/5 17:54, David Laight wrote:
>>>> From: Tian, Kevin
>>>>> Sent: 05 May 2016 10:37
>>>> ...
>>>>>> Actually, we are not aiming to access the MSI-X table from
>>>>>> guest. So I think it's safe to passthrough MSI-X table if we
>>>>>> can make sure guest kernel would not touch MSI-X table in
>>>>>> normal code path such as para-virtualized guest kernel on PPC64.
>>>>> Then how do you prevent malicious guest kernel accessing it?
>>>> Or a malicious guest driver for an ethernet card setting up
>>>> the receive buffer ring to contain a single word entry that
>>>> contains the address associated with an MSI-X interrupt and
>>>> then using a loopback mode to cause a specific packet to be
>>>> received that writes the required word through that address.
>>>> Remember the PCIe cycle for an interrupt is a normal memory write
>>>> cycle.
>>>> 	David
>>> If we have enough permission to load a malicious driver or
>>> kernel, we can easily break the guest without exposed
>>> MSI-X table.
>>> I think it should be safe to expose MSI-X table if we can
>>> make sure that malicious guest driver/kernel can't use
>>> the MSI-X table to break other guest or host. The
>>> capability of IRQ remapping could provide this
>>> kind of protection.
>> With IRQ remapping it doesn't mean you can pass through MSI-X
>> structure to guest. I know actual IRQ remapping might be platform
>> specific, but at least for Intel VT-d specification, MSI-X entry must
>> be configured with a remappable format by host kernel which
>> contains an index into the IRQ remapping table. The index will find an
>> IRQ remapping entry which controls interrupt routing for a specific
>> device. If you allow a malicious guest to program a random index into
>> the MSI-X entry of an assigned device, the hole is obvious...
>> Above might make sense only for an IRQ remapping implementation
>> which doesn't rely on extended MSI-X format (e.g. simply based on
>> BDF). If that's the case for PPC, then you should build MSI-X
>> passthrough based on this fact instead of general IRQ remapping
>> enabled or not.
> I don't think anyone is expecting that we can expose the MSI-X vector
> table to the guest and the guest can make direct use of it.  The end
> goal here is that the guest on a power system is already
> paravirtualized to not program the device MSI-X by directly writing to
> the MSI-X vector table.  They have hypercalls for this since they
> always run virtualized.  Therefore a) they never intend to touch the
> MSI-X vector table and b) they have sufficient isolation that a guest
> can only hurt itself by doing so.
> On x86 we don't have a), our method of programming the MSI-X vector
> table is to directly write to it. Therefore we will always require QEMU
> to place a MemoryRegion over the vector table to intercept those
> accesses.  However with interrupt remapping, we do have b) on x86, which
> means that we don't need to be so strict in disallowing user accesses
> to the MSI-X vector table.  It's not useful for configuring MSI-X on
> the device, but the user should only be able to hurt themselves by
> writing it directly.  x86 doesn't really get anything out of this
> change, but it helps this special case on power pretty significantly
> aiui.  Thanks,

Excellent short overview, saved :)

How do we proceed with these patches? Nobody seems to be objecting to them,
but nobody seems to be taking them either...
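
For reference, a minimal sketch of the two points in the quoted discussion.
The first half shows how a remapping-table index is packed into the low
address word of a VT-d remappable-format MSI/MSI-X entry, which is the
index Kevin refers to. The second half restates Alex's a)/b) summary as
two predicates. All names here (remappable_msi_addr, struct platform_caps,
may_mmap_msix_table, and so on) are hypothetical and chosen for
illustration; this is not the code from the patch series, and the bit
layout is my reading of the VT-d format with the SHV bit left clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants for the remappable-format MSI address. */
#define MSI_ADDR_BASE        0xFEE00000u /* fixed interrupt address window */
#define MSI_ADDR_REMAPPABLE  (1u << 4)   /* interrupt format: remappable */

/* Pack a 16-bit remapping-table index into the low address word. */
static uint32_t remappable_msi_addr(uint16_t index)
{
    return MSI_ADDR_BASE
         | ((uint32_t)(index & 0x7FFFu) << 5)    /* index[14:0] -> bits 19:5 */
         | MSI_ADDR_REMAPPABLE
         | ((uint32_t)(index & 0x8000u) >> 13);  /* index[15]   -> bit 2 */
}

/* Recover the index from an address word built as above. */
static uint16_t remappable_msi_index(uint32_t addr)
{
    return (uint16_t)(((addr >> 5) & 0x7FFFu) | ((addr & (1u << 2)) << 13));
}

/* Hypothetical summary of the two properties discussed in the thread. */
struct platform_caps {
    bool guest_writes_table_directly; /* x86: true; paravirtualized power guest: false */
    bool interrupts_isolated;         /* a guest can only hurt itself via the table */
};

/* b) decides whether user access to the table can be tolerated at all. */
static bool may_mmap_msix_table(const struct platform_caps *caps)
{
    return caps->interrupts_isolated;
}

/* Lacking a) means the VMM still has to intercept table writes. */
static bool vmm_must_trap_msix_table(const struct platform_caps *caps)
{
    return caps->guest_writes_table_directly;
}
```

With these definitions, an x86 host with interrupt remapping would have
both predicates true (mmap tolerated, QEMU still traps the table), while
the paravirtualized power case would have the first true and the second
false.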

