[PATCH] powernv/pci: Fix m64 checks for SR-IOV and window alignment

Russell Currey ruscur at russell.cc
Mon Sep 19 16:37:46 AEST 2016


On Wed, 2016-09-14 at 21:30 +1000, Gavin Shan wrote:
> On Wed, Sep 14, 2016 at 05:51:08PM +1000, Benjamin Herrenschmidt wrote:
> > 
> > On Wed, 2016-09-14 at 16:37 +1000, Russell Currey wrote:
> > > 
> > > Commit 5958d19a143e checks for prefetchable m64 BARs by comparing the
> > > addresses instead of using resource flags.  This broke SR-IOV as the
> > > m64
> > > check in pnv_pci_ioda_fixup_iov_resources() fails.
> > > 
> > > The condition in pnv_pci_window_alignment() also changed to checking
> > > only IORESOURCE_MEM_64 instead of both IORESOURCE_MEM_64 and
> > > IORESOURCE_PREFETCH.
> > 
> > CC'ing Gavin who might have some insight in the matter.
> > 
> > Why do we check for prefetch ? On PCIe, any 64-bit BAR can live under a
> > prefetchable region afaik... Gavin, any idea ?
> > 
> 
> Ben, what I understood for long time: non-prefetchable BAR cannot live under
> a prefetchable region (window), but any BAR can live under non-prefetchable
> region (window).
> 
> > 
> > 
> > > 
> > > Revert these cases to the previous behaviour, adding a new helper
> > > function
> > > to do so.  This is named pnv_pci_is_m64_flags() to make it clear this
> > > function is only looking at resource flags and should not be relied
> > > on for
> > > non-SRIOV resources.
> > > 
> > > Fixes: 5958d19a143e ("Fix incorrect PE reservation attempt on some
> > > 64-bit BARs")
> > > Reported-by: Alexey Kardashevskiy <aik at ozlabs.ru>
> > > Signed-off-by: Russell Currey <ruscur at russell.cc>
> > > ---
> > >  arch/powerpc/platforms/powernv/pci-ioda.c | 11 +++++++++--
> > >  1 file changed, 9 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
> > > b/arch/powerpc/platforms/powernv/pci-ioda.c
> > > index c16d790..2f25622 100644
> > > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > > @@ -124,6 +124,13 @@ static inline bool pnv_pci_is_m64(struct pnv_phb
> > > *phb, struct resource *r)
> > >  		r->start < (phb->ioda.m64_base + phb-
> > > > 
> > > > ioda.m64_size));
> > >  }
> > >  
> > > +static inline bool pnv_pci_is_m64_flags(unsigned long
> > > resource_flags)
> > > +{
> > > +	unsigned long flags = (IORESOURCE_MEM_64 |
> > > IORESOURCE_PREFETCH);
> > > +
> > > +	return (resource_flags & flags) == flags;
> > > +}
> > > 
> > I don't agree. See below.
> > 
> > > 
> > >  static struct pnv_ioda_pe *pnv_ioda_init_pe(struct pnv_phb *phb, int
> > > pe_no)
> > >  {
> > >  	phb->ioda.pe_array[pe_no].phb = phb;
> > > @@ -2871,7 +2878,7 @@ static void
> > > pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
> > >  		res = &pdev->resource[i + PCI_IOV_RESOURCES];
> > >  		if (!res->flags || res->parent)
> > >  			continue;
> > > -		if (!pnv_pci_is_m64(phb, res)) {
> > > +		if (!pnv_pci_is_m64_flags(res->flags)) {
> > >  			dev_warn(&pdev->dev, "Don't support SR-IOV
> > > with"
> > >  					" non M64 VF BAR%d: %pR.
> > > \n",
> > >  				 i, res);
> > 
> > What is that function actually doing ? Having IORESOURCE_64 and
> > PREFETCHABLE is completely orthogonal to being in the M64 region. This
> > is the bug my original patch was fixing in fact as it's possible for
> > the allocator to put a 64-bit resource in the M32 region.
> > 
> 
> This function is called before the resoureces are resized and assigned.
> So using the resource's start/end addresses to judge it's in M64 or M32
> windows are not reliable. Currently, all IOV BARs is required to have
> (IORESOURCE_64 | PREFETCHABLE) which is covered by bridge's M64 window
> and PHB's M64 windows (BARs).
> 
> > 
> > > 
> > > @@ -3096,7 +3103,7 @@ static resource_size_t
> > > pnv_pci_window_alignment(struct pci_bus *bus,
> > >  	 * alignment for any 64-bit resource, PCIe doesn't care and
> > >  	 * bridges only do 64-bit prefetchable anyway.
> > >  	 */
> > > -	if (phb->ioda.m64_segsize && (type & IORESOURCE_MEM_64))
> > > +	if (phb->ioda.m64_segsize && pnv_pci_is_m64_flags(type))
> > >  		return phb->ioda.m64_segsize;
> > 
> > I disagree similarly. 64-bit non-prefetchable resources should live in
> > the M64 space as well.
> > 
> 
> As I understood, 64-bits non-prefetchable BARs cannot live behind
> M64 (64-bits prefetchable) windows.
> 
> > 
> > > 
> > >  	if (type & IORESOURCE_MEM)
> > >  		return phb->ioda.m32_segsize;
> > 
> > Something seems to be deeply wrong here and this patch looks to me that
> > it's just papering over the problem in way that could bring back the
> > bugs I've seen if the generic allocator decides to put things in the
> > M32 window.
> > 
> > We need to look at this more closely and understand WTF that code
> > intends means to do.
> > 
> 
> Yeah, it seems it partially reverts your changes. The start/end addresses
> are usable after resource resizing/assignment is finished. Before that,
> we still need to use the flags.

I agree with Ben that we need to look at this more closely to find a proper fix
rather than this hacky partial revert, but for now it's important that we fix
SR-IOV and thus I think this patch should be carried forward.

This patch is a bandaid, but I believe completely fixing the underlying problem
is not achievable given we're at rc7. 

As a side note, I am going to prototype a heavy refactor of the allocation code
that simplifies things from an EEH perspective and allows us to use more generic
PCI code.

> 
> Thanks,
> Gavin
> 
> 
> > 
> > Cheers,
> > Ben.
> > 



More information about the Linuxppc-dev mailing list