[PATCH] powerpc/pci: Add config option for using OF 'reg' for PCI domain

Bjorn Helgaas helgaas at kernel.org
Fri Jun 10 03:10:22 AEST 2022


On Thu, Jun 09, 2022 at 06:27:25PM +0200, Pali Rohár wrote:
> On Thursday 09 June 2022 11:22:55 Bjorn Helgaas wrote:
> > [+cc Guilherme, Michael, Ben (author of 63a72284b159 and PPC folks), thread:
> > https://lore.kernel.org/r/20220504175718.29011-1-pali@kernel.org]
> > 
> > On Fri, May 06, 2022 at 12:33:02AM +0200, Pali Rohár wrote:
> > > On Thursday 05 May 2022 15:10:01 Tyrel Datwyler wrote:
> > > > On 5/5/22 02:31, Pali Rohár wrote:
> > > > > On Thursday 05 May 2022 07:16:40 Christophe Leroy wrote:
> > > > >> Le 04/05/2022 à 19:57, Pali Rohár a écrit :
> > > > >>> Since commit 63a72284b159 ("powerpc/pci: Assign fixed PHB
> > > > >>> number based on device-tree properties"), powerpc kernel
> > > > >>> always fallback to PCI domain assignment from OF / Device Tree
> > > > >>> 'reg' property of the PCI controller.
> > > > >>>
> > > > >>> PCI code for other Linux architectures use increasing
> > > > >>> assignment of the PCI domain for individual controllers
> > > > >>> (assign the first free number), like it was also for powerpc
> > > > >>> prior mentioned commit.
> > > > >>>
> > > > >>> Upgrading powerpc kernels from LTS 4.4 version (which does not
> > > > >>> contain mentioned commit) to new LTS versions brings a
> > > > >>> regression in domain assignment.
> > > > >>
> > > > >> Can you elaborate why it is a regression ?
> > > > >> 63a72284b159 That commit says 'no functionnal changes', I'm
> > > > >> having hard time understanding how a nochange can be a
> > > > >> regression.
> > > > > 
> > > > > It is not 'no functional change'. That commit completely changed
> > > > > PCI domain assignment in a way that is incompatible with other
> > > > > architectures and also incompatible with the way how it was done
> > > > > prior that commit.
> > > > 
> > > > I agree that the "no functional change" statement is incorrect.
> > > > However, for most powerpc platforms it ended up being simply a
> > > > cosmetic behavior change. As far as I can tell there is nothing
> > > > requiring domain ids to increase montonically from zero or that
> > > > each architecture is required to use the same domain numbering
> > > > scheme.
> > > 
> > > That is truth. But it looks really suspicious why domains are not
> > > assigned monotonically. Some scripts / applications are using PCI
> > > location (domain:bus:dev:func) for remembering PCI device and domain
> > > change can cause issue for config files. And some (older) applications
> > > expects existence of domain zero. In systems without hot plug support
> > > with small number of domains (e.g. 3) it means that there are always
> > > domains 0, 1 and 2.
> > > 
> > > > Its hard to call this a true regression unless it actually broke
> > > > something. The commit in question has been in the kernel since 4.8
> > > > which was released over 5 1/2 years ago.
> > > 
> > > I agree, it really depends on how you look at it.
> > > 
> > > The important is that lot of people are using LTS versions and are
> > > doing upgrades when LTS support is dropped. Which for 4.4 now
> > > happened. So not all smaller or "cosmetic" changes could be detected
> > > until longer LTS period pass.
> > > 
> > > > With all that said looking closer at the code in question I think
> > > > it is fair to assume that the author only intended this change for
> > > > powernv and pseries platforms and not every powerpc platform. That
> > > > change was done to make persistent naming easier to manage in
> > > > userspace.
> > > 
> > > I agree that this behavior change may be useful in some situations
> > > and I do not object this need.
> > > 
> > > > Your change defaults back to the old behavior which will now break
> > > > both powernv and pseries platforms with regard to hotplugging and
> > > > persistent naming.
> > > 
> > > I was aware of it, that change could cause issues. And that is why I
> > > added config option for choosing behavior. So users would be able to
> > > choose what they need.
> > > 
> > > > We could properly limit it to powernv and pseries by using
> > > > ibm,fw-phb-id instead of reg property in the look up that follows
> > > > a failed ibm,opal-phbid lookup. I think this is acceptable as long
> > > > as no other powerpc platforms have started using this behavior for
> > > > persistent naming.
> > > 
> > > And what about setting that new config option to enabled by default
> > > for those series?
> > > 
> > > Or is there issue with introduction of the new config option?
> > > 
> > > One of the point is that it is really a good idea to have
> > > similar/same behavior for all linux platforms. And if it cannot be
> > > enabled by default (for backward compatibility) add at least some
> > > option, so new platforms can start using it or users can decide to
> > > switch behavior.
> > 
> > This is a powerpc thing so I'm just kibbitzing a little.
> > 
> > This basically looks like a new config option to selectively revert
> > 63a72284b159.  That seems hard to maintain and doesn't seem like
> > something that needs to be baked into the kernel at compile-time.
> > 
> > The 63a72284b159 commit log says persistent NIC names are tied to PCI
> > domain/bus/dev/fn addresses, which seems like something we should
> > discourage because we can't predict PCI addresses in general.  I
> > assume other platforms typically use udev with MAC addresses or
> > something?
> 
> This is not about ethernet NIC cards only. But affects also WiFi cards
> (which registers phy dev, not netdev) and also all other PCIe cards
> which do not have to be network-based. Hence MAC address or udev does
> not play role there.

What persistent naming mechanism do other platforms use in those
cases?

I forgot to ask before about the actual regression here.  The commit
log says domain numbers are different, but I don't know the connection
from there to something failing.  I assume there's some script or
config file that depends on specific domain numbers?  And that
dependency is (hopefully) powerpc-specific?

Bjorn


More information about the Linuxppc-dev mailing list