[PATCH V9 03/18] PCI: Add weak pcibios_iov_resource_size() interface
Bjorn Helgaas
bhelgaas at google.com
Wed Nov 19 12:12:43 AEDT 2014
[+cc Don, Myron]
Hi Wei,
On Sun, Nov 02, 2014 at 11:41:19PM +0800, Wei Yang wrote:
> When retrieving VF IOV BAR in virtfn_add(), it will divide the total PF's IOV
> BAR size with the totalVF number. This is true for most cases, while may not
> be correct on some specific platform.
>
> For example on PowerNV platform, in order to fix PF's IOV BAR into a hardware
> alignment, the PF's IOV BAR size would be expended. This means the original
> method couldn't work.
>
> This patch introduces a weak pcibios_iov_resource_size() interface, which
> gives platform a chance to implement specific method to calculate the VF BAR
> resource size.
>
> Signed-off-by: Wei Yang <weiyang at linux.vnet.ibm.com>
> ---
> drivers/pci/iov.c | 27 +++++++++++++++++++++++++--
> include/linux/pci.h | 5 +++++
> 2 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 4d1685d..6866830 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -61,6 +61,30 @@ static void virtfn_remove_bus(struct pci_bus *physbus, struct pci_bus *virtbus)
> pci_remove_bus(virtbus);
> }
>
> +resource_size_t __weak pcibios_iov_resource_size(struct pci_dev *dev, int resno)
> +{
> + return 0;
> +}
> +
> +resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
> +{
> + resource_size_t size;
> + struct pci_sriov *iov;
> +
> + if (!dev->is_physfn)
> + return 0;
> +
> + size = pcibios_iov_resource_size(dev, resno);
> + if (size != 0)
> + return size;
> +
> + iov = dev->sriov;
> + size = resource_size(dev->resource + resno);
> + do_div(size, iov->total_VFs);
> +
> + return size;
> +}
> +
> static int virtfn_add(struct pci_dev *dev, int id, int reset)
> {
> int i;
> @@ -96,8 +120,7 @@ static int virtfn_add(struct pci_dev *dev, int id, int reset)
> continue;
> virtfn->resource[i].name = pci_name(virtfn);
> virtfn->resource[i].flags = res->flags;
> - size = resource_size(res);
> - do_div(size, iov->total_VFs);
> + size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
> virtfn->resource[i].start = res->start + size * id;
Can you help me understand this?
We have previously called sriov_init() on the PF. There, we sized the VF
BARs, which are in the PF's SR-IOV Capability (SR-IOV spec sec 3.3.14).
The size we discover is the amount of space required by a single VF, so
sriov_init() adjusts PF->resource[PCI_IOV_RESOURCES + i] by multiplying
that size by PCI_SRIOV_TOTAL_VF, so this PF resource is now big enough to
hold the VF BAR[i] areas for all the possible VFs.
Now we're in virtfn_add(), setting up a new VF. The usual BARs at config
space 0x10, etc., are read-only zeroes for a VF, so we don't size them the
usual way. Instead, we carve out a slice of the
PF->resource[PCI_IOV_RESOURCES + i] area.
I thought the starting address of the VF BARn memory aperture was
prescribed by the spec in sec 2.1.1.1 and shown in figure 2-1:
BARx VFy starting address = VF BARx + (y - 1) * (VF BARx aperture size)
That's basically what the existing code does. We don't save the VF BARx
aperture size, so we recompute it by undoing the multiplication we did in
sriov_init().
But you're computing the starting address using a different VF BARx
aperture size. How does that work? I assumed this calculation was built
into the PCI device, but obviously I'm missing something.
To make it concrete, here's a made-up example:
PF SR-IOV Capability
TotalVFs = 4
NumVFs = 4
System Page Size = 4KB
VF BAR0 = [mem 0x00000000-0x00000fff] (4KB at address 0)
PF pci_dev->resource[7] = [mem 0x00000000-0x00003fff] (16KB)
VF1 pci_dev->resource[0] = [mem 0x00000000-0x00000fff]
VF2 pci_dev->resource[0] = [mem 0x00001000-0x00001fff]
VF3 pci_dev->resource[0] = [mem 0x00002000-0x00002fff]
VF4 pci_dev->resource[0] = [mem 0x00003000-0x00003fff]
You're changing this so we might use 16KB as the VF BAR0 aperture size
instead of 4KB, which would result in the following:
VF1 pci_dev->resource[0] = [mem 0x00000000-0x00003fff]
VF2 pci_dev->resource[0] = [mem 0x00004000-0x00007fff]
VF3 pci_dev->resource[0] = [mem 0x00008000-0x0000bfff]
VF4 pci_dev->resource[0] = [mem 0x0000c000-0x0000ffff]
But you didn't change sriov_init(), so the PF resource[7] is only 16KB, not
the 64KB the VFs need. And I assume the VF address decoder is wired to
claim the 4KB regions at 0x0000, 0x1000, 0x2000, 0x3000, not the ones at
0x0000, 0x4000, 0x8000, 0xc000.
Bjorn
> virtfn->resource[i].end = virtfn->resource[i].start + size - 1;
> rc = request_resource(res, &virtfn->resource[i]);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index bbf8058..2f5b454 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1162,6 +1162,8 @@ resource_size_t pcibios_window_alignment(struct pci_bus *bus,
> resource_size_t pcibios_iov_resource_alignment(struct pci_dev *dev,
> int resno,
> resource_size_t align);
> +resource_size_t pcibios_iov_resource_size(struct pci_dev *dev,
> + int resno);
>
> #define PCI_VGA_STATE_CHANGE_BRIDGE (1 << 0)
> #define PCI_VGA_STATE_CHANGE_DECODES (1 << 1)
> @@ -1666,6 +1668,7 @@ int pci_num_vf(struct pci_dev *dev);
> int pci_vfs_assigned(struct pci_dev *dev);
> int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs);
> int pci_sriov_get_totalvfs(struct pci_dev *dev);
> +resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno);
> #else
> static inline int pci_iov_virtfn_bus(struct pci_dev *dev, int id)
> {
> @@ -1685,6 +1688,8 @@ static inline int pci_sriov_set_totalvfs(struct pci_dev *dev, u16 numvfs)
> { return 0; }
> static inline int pci_sriov_get_totalvfs(struct pci_dev *dev)
> { return 0; }
> +static inline resource_size_t pci_iov_resource_size(struct pci_dev *dev, int resno)
> +{ return 0; }
> #endif
>
> #if defined(CONFIG_HOTPLUG_PCI) || defined(CONFIG_HOTPLUG_PCI_MODULE)
> --
> 1.7.9.5
>
More information about the Linuxppc-dev
mailing list