BAR resizing broken in 6.18 (PPC only?)

Ilpo Järvinen ilpo.jarvinen at linux.intel.com
Fri Oct 24 04:42:56 AEDT 2025


On Wed, 22 Oct 2025, Simon Richter wrote:
> On 10/22/25 1:20 AM, Ilpo Järvinen wrote:
> 
> > Could you please test if the patch below helps.
> 
> Yes, this looks better.
> 
>  - "good" is the 6.17 reference
>  - "shrink" is with this patch and the BAR0 release from Lucas
>  - "bar0" is with this patch, with the bridge BAR0 still mapped (i.e. without
> the patch from Lucas)
> 
> If you compare "good" vs "bar0", the differences are now fairly minimal. The
> non-prefetchable window has shrunk, but assignments are otherwise the same.

If a window has extra size prior to any resource fitting operation, the
kernel will recalculate the size based on what it knows about the
downstream resource sizes and no more, so the extra size is removed.

I thought that old_size was there to prevent such shrinkage, but it is
problematic as we've seen here (and also in some other cases).

It would be possible to move the max() with old_size outside of ALIGN(),
so something like this instead of the patch you tested:

-       return ALIGN(max(size, old_size), align);
+       return max(ALIGN(size, align), old_size);

That would not try to make the bridge window larger than old_size just
because of alignment, so it should still fit into its old range while
keeping its old size.
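
To illustrate the difference, here is a minimal userspace sketch. It is
not the kernel code: align_up() and max_u64() are stand-ins for the
kernel's ALIGN() and max(), the two wrapper functions are hypothetical
names, and the sizes mentioned afterwards are made up for illustration.

```c
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's ALIGN(): round x up to a multiple
 * of a. Only valid when a is a power of two.
 */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Userspace stand-in for max() on 64-bit values. */
static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/* The "-" line of the diff: old_size itself gets rounded up to align. */
static uint64_t window_size_before(uint64_t size, uint64_t old_size,
				   uint64_t align)
{
	return align_up(max_u64(size, old_size), align);
}

/* The "+" line of the diff: only the computed size is rounded up. */
static uint64_t window_size_after(uint64_t size, uint64_t old_size,
				  uint64_t align)
{
	return max_u64(align_up(size, align), old_size);
}
```

With a made-up size of 0x08000000, old_size of 0x18000000 and align of
0x10000000, window_size_before() returns 0x20000000, which is larger than
the old window, while window_size_after() returns 0x18000000, i.e. exactly
the old size.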

> I've added "lspci -v" output as well, which shows the bridge configuration.
> I'm still not sure that the address mappings between PCI and system bus are
> 1:1.
> 
> So the BAR0 release patch from Lucas seems to be no longer required with this,
> although it does align the prefetchable area better, so in theory it would
> allow a 512G BAR to be mapped. In practice, there are no Intel dGPUs with 512G
> VRAM.
>
> > There's indeed something messy and odd going on here with the resource and
> > window mappings, in the bad case there's also this line which doesn't make
> > much sense:
> > +pci 0030:01:00.0: bridge window [mem 0x6200000000000-0x6203fbff0ffff 64bit
> > pref]: can't claim; address conflict with 0030:01:00.0 [mem
> > 0x6200020000000-0x62000207fffff 64bit pref]
> 
> > ...but that conflicting resource was not assigned in between releasing
> > this bridge window and trying to claim it back so how did that
> > conflicting resource get there is totally mysterious to me. It doesn't
> > seem related directly to the resize no longer working though.
> 
> That is the upstream bridge's BAR0 mapping, which is not a bridge window, so
> presumably the window allocation algorithm is unaware of it.

The resource tree is independent of PCI's resource allocation algorithm.
Now that I look at the numbers and logs again, this doesn't look like a
valid resource tree state (from iomem.good!):

6200000000000-6203fbfffffff : pciex@620c3c0000000
  6200000000000-6203fbff0ffff : PCI Bus 0030:01
    6200020000000-62000207fffff : 0030:01:00.0
    6200000000000-6203fbff0ffff : PCI Bus 0030:02
      6200400000000-62007ffffffff : PCI Bus 0030:03
        6200400000000-62007ffffffff : 0030:03:00.0

6200020000000-62000207fffff and 6200000000000-6203fbff0ffff appear as 
siblings and those addresses conflict. It seems this "good" kernel is 
"cheating" by double counting addresses... ;-D
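
The conflict is easy to confirm by hand. The following is only a minimal
userspace sketch of the overlap test, not kernel code: struct range and
ranges_overlap() are hypothetical names, and the two ranges are copied
from the iomem.good excerpt above.

```c
#include <stdbool.h>
#include <stdint.h>

/* An address range with inclusive start and end, as printed in iomem. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Two inclusive ranges overlap if each one starts before the other ends. */
static bool ranges_overlap(struct range a, struct range b)
{
	return a.start <= b.end && b.start <= a.end;
}

/* 0030:01:00.0, the bridge's own BAR0 */
static const struct range bar0 = {
	0x6200020000000ULL, 0x62000207fffffULL
};

/* PCI Bus 0030:02, listed as a sibling of the above */
static const struct range bus02 = {
	0x6200000000000ULL, 0x6203fbff0ffffULL
};
```

ranges_overlap(bar0, bus02) is true, which siblings in a valid resource
tree should never be.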

I've now found the cause, in part thanks to another reporter with
similar impossible resource conflicts: it is an old bug in the resizing
algorithm that has been there since BAR resizing was introduced.

It will take me a few days to fix all this, as fixing the claim issue
will make other dominoes fall, so unfortunately I'll have to refactor
the pci_resize_resource() interface now.

> > > It's a bit weird that there is a log message that says "enabling device",
> > > then
> > > the BARs are reconfigured. I'd want the decoding logic to be inactive
> > > while
> > > addresses are assigned.
> 
> > So no real issue here and only logging is not the way you'd want it?
> 
> It works for the GPU, but I'm unsure about my FPGA designs now. For the most
> part, I would have expected that the "enable memory decoding" bit had to be 0
> while BAR registers are being written, and I would have expected the driver to
> resize the BAR first, then enable the device.

Lucas did move resizing earlier, but I guess it still occurs after enabling
the device. I don't know enough about the xe driver to say how early BARs
could be resized.

-- 
 i.

