[PATCH v2 2/2] PCI: Don't disable PF's memory decoding when enabling SRIOV

Gavin Shan gwshan at linux.vnet.ibm.com
Mon Oct 24 10:28:02 AEDT 2016


On Fri, Oct 21, 2016 at 11:50:34AM -0500, Bjorn Helgaas wrote:
>Hi Gavin,
>
>On Fri, Sep 30, 2016 at 09:47:50AM +1000, Gavin Shan wrote:
>> pci_update_resource() might be called to update (shift) IOV BARs
>> in PPC PowerNV specific pcibios_sriov_enable() when enabling PF's
>> SRIOV capability. At that point, the PF have been functional if
>> the SRIOV is enabled through sysfs entry "sriov_numvfs". The PF's
>> memory decoding (0x2 in PCI_COMMAND) shouldn't be disabled when
>> updating its IOV BARs with pci_update_resource(). Otherwise, we
>> receives EEH error caused by MMIO access to PF's memory BARs during
>> the window when PF's memory decoding is disabled.
>
>The fact that you get EEH errors is irrelevant.  We can't write code
>simply to avoid errors -- we have to write code to make the system
>work correctly.
>
>I do not want to add a "mmio_force_on" parameter to
>pci_update_resource().  That puts the burden on the caller to
>understand this subtle issue.  If the caller passes mmio_force_on=1
>when it shouldn't, things will almost always work, but once in a blue
>moon a half-updated BAR will conflict with some other device in the
>system, and we'll have an unreproducible, undebuggable crash.
>

Bjorn, thanks for your comments. Yes, the EEH error was caused by MMIO
access to PF's normal BARs, not VF BARs. Yeah, I also had the conern
that adding parameter to pci_update_resource() increases the visible
complexity to the caller of the function.

>What you do need is an explanation of why it's safe to non-atomically
>update a VF BARx in the SR-IOV capability.  I think this probably
>involves the VF MSE bit, and you probably have to either disable VFs
>completely or clear the MSE bit while you're updating the VF BARx.  We
>should be able to do this inside pci_update_resource() without
>changing the interface.
>

Yes, It's what PATCH[1/2] does: (PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE)
are set after VF BARs are updated with pci_update_resource() in this PPC
specific scenario. There are other two situations where the IOV BARs are
updated: PCI resource resizing and allocation during system booting or hot
adding PF. The VF shouldn't be enabled in both cases when updating IOV BARs.

I think you suggest to add check in pci_update_resource(): Don't disable
PF's memory decoding when updating VF BARs. I will send updated revision
if it's what you want.

	/*
	 * The PF might be functional when updating its IOV BARs. So PF's
	 * memory decoding shouldn't be disabled when updating its IOV BARs.
	 */
	disable = (res->flags & IORESOURCE_MEM_64) && !dev->mmio_always_on;
#ifdef CONFIG_PCI_IOV
	disable &= !(resno >= PCI_IOV_RESOURCES &&
		     resno <= PCI_IOV_RESOURCE_END);
#endif
	if (disable) {
		pci_read_config_word(dev, PCI_COMMAND, &cmd);
		pci_write_config_word(dev, PCI_COMMAND,
				      cmd & ~PCI_COMMAND_MEMORY);
	}

Thanks,
Gavin

>>    sriov_numvfs_store
>>    pdev->driver->sriov_configure
>>    mlx5_core_sriov_configure
>>    pci_enable_sriov
>>    sriov_enable
>>    pcibios_sriov_enable
>>    pnv_pci_sriov_enable
>>    pnv_pci_vf_resource_shift
>>    pci_update_resource
>> 
>> This doesn't change the PF's memory decoding in the path by introducing
>> additional parameter (@mmio_force_on) to pci_update_resource() in
>> the above path.
>
>> 
>> Reported-by: Carol Soto <clsoto at us.ibm.com>
>> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>> Tested-by: Carol Soto <clsoto at us.ibm.com>
>> ---
>>  arch/powerpc/platforms/powernv/pci-ioda.c | 2 +-
>>  drivers/pci/iov.c                         | 2 +-
>>  drivers/pci/pci.c                         | 2 +-
>>  drivers/pci/setup-res.c                   | 9 +++++----
>>  include/linux/pci.h                       | 2 +-
>>  5 files changed, 9 insertions(+), 8 deletions(-)
>> 
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index 38a5c65..f4ccc5b 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -1006,7 +1006,7 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
>>  		dev_info(&dev->dev, "VF BAR%d: %pR shifted to %pR (%sabling %d VFs shifted by %d)\n",
>>  			 i, &res2, res, (offset > 0) ? "En" : "Dis",
>>  			 num_vfs, offset);
>> -		pci_update_resource(dev, i + PCI_IOV_RESOURCES);
>> +		pci_update_resource(dev, i + PCI_IOV_RESOURCES, true);
>>  	}
>>  	return 0;
>>  }
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index f1343f0..db31966 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -511,7 +511,7 @@ static void sriov_restore_state(struct pci_dev *dev)
>>  		return;
>>  
>>  	for (i = PCI_IOV_RESOURCES; i <= PCI_IOV_RESOURCE_END; i++)
>> -		pci_update_resource(dev, i);
>> +		pci_update_resource(dev, i, false);
>>  
>>  	pci_write_config_dword(dev, iov->pos + PCI_SRIOV_SYS_PGSIZE, iov->pgsz);
>>  	pci_iov_set_numvfs(dev, iov->num_VFs);
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index aab9d51..87a33c0 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -545,7 +545,7 @@ static void pci_restore_bars(struct pci_dev *dev)
>>  		return;
>>  
>>  	for (i = 0; i < PCI_BRIDGE_RESOURCES; i++)
>> -		pci_update_resource(dev, i);
>> +		pci_update_resource(dev, i, false);
>>  }
>>  
>>  static const struct pci_platform_pm_ops *pci_platform_pm;
>> diff --git a/drivers/pci/setup-res.c b/drivers/pci/setup-res.c
>> index 66c4d8f..e8a50ff 100644
>> --- a/drivers/pci/setup-res.c
>> +++ b/drivers/pci/setup-res.c
>> @@ -26,7 +26,7 @@
>>  #include "pci.h"
>>  
>>  
>> -void pci_update_resource(struct pci_dev *dev, int resno)
>> +void pci_update_resource(struct pci_dev *dev, int resno, bool mmio_force_on)
>>  {
>>  	struct pci_bus_region region;
>>  	bool disable;
>> @@ -81,7 +81,8 @@ void pci_update_resource(struct pci_dev *dev, int resno)
>>  	 * disable decoding so that a half-updated BAR won't conflict
>>  	 * with another device.
>>  	 */
>> -	disable = (res->flags & IORESOURCE_MEM_64) && !dev->mmio_always_on;
>> +	disable = (res->flags & IORESOURCE_MEM_64) &&
>> +		  !mmio_force_on && !dev->mmio_always_on;
>>  	if (disable) {
>>  		pci_read_config_word(dev, PCI_COMMAND, &cmd);
>>  		pci_write_config_word(dev, PCI_COMMAND,
>> @@ -310,7 +311,7 @@ int pci_assign_resource(struct pci_dev *dev, int resno)
>>  	res->flags &= ~IORESOURCE_STARTALIGN;
>>  	dev_info(&dev->dev, "BAR %d: assigned %pR\n", resno, res);
>>  	if (resno < PCI_BRIDGE_RESOURCES)
>> -		pci_update_resource(dev, resno);
>> +		pci_update_resource(dev, resno, false);
>>  
>>  	return 0;
>>  }
>> @@ -350,7 +351,7 @@ int pci_reassign_resource(struct pci_dev *dev, int resno, resource_size_t addsiz
>>  	dev_info(&dev->dev, "BAR %d: reassigned %pR (expanded by %#llx)\n",
>>  		 resno, res, (unsigned long long) addsize);
>>  	if (resno < PCI_BRIDGE_RESOURCES)
>> -		pci_update_resource(dev, resno);
>> +		pci_update_resource(dev, resno, false);
>>  
>>  	return 0;
>>  }
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 0ab8359..99231d1 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -1039,7 +1039,7 @@ int pci_try_reset_bus(struct pci_bus *bus);
>>  void pci_reset_secondary_bus(struct pci_dev *dev);
>>  void pcibios_reset_secondary_bus(struct pci_dev *dev);
>>  void pci_reset_bridge_secondary_bus(struct pci_dev *dev);
>> -void pci_update_resource(struct pci_dev *dev, int resno);
>> +void pci_update_resource(struct pci_dev *dev, int resno, bool mmio_force_on);
>>  int __must_check pci_assign_resource(struct pci_dev *dev, int i);
>>  int __must_check pci_reassign_resource(struct pci_dev *dev, int i, resource_size_t add_size, resource_size_t align);
>>  int pci_select_bars(struct pci_dev *dev, unsigned long flags);
>> -- 
>> 2.1.0
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>> the body of a message to majordomo at vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



More information about the Linuxppc-dev mailing list