<div dir="ltr">2017-04-15 6:54 GMT+08:00 Bjorn Helgaas <span dir="ltr"><<a href="mailto:helgaas@kernel.org" target="_blank">helgaas@kernel.org</a>></span>:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span>On Mon, Apr 10, 2017 at 07:58:14PM +0800, Yongji Xie wrote:<br>
> Currently we reassign the alignment by extending resources' size in<br>
> pci_reassigndev_resource_align<wbr>ment(). This could potentially break<br>
> some drivers when the driver uses the size to locate register<br>
> whose length is related to the size. Some examples as below:<br>
><br>
> - misc/Hpilo.c:<br>
<br>
</span>If you're going to quote filenames, they need to match the actual<br>
files in the tree. In this case, "hpilo.c", not "Hpilo.c". Also<br>
"Nes_hw.c" below is incorrect.<br>
<span><br></span></blockquote><div><br></div><div>Will fix it.<br> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span>
> off = pci_resource_len(pdev, bar) - 0x2000;<br>
><br>
> - net/ethernet/chelsio/cxgb4/cxg<wbr>b4_uld.h:<br>
> (pci_resource_len((pdev), 2) - roundup_pow_of_two((vres)->ocq<wbr>.size))<br>
><br>
> - infiniband/hw/nes/Nes_hw.c:<br>
> num_pds = pci_resource_len(nesdev->pcide<wbr>v, BAR_1) >> PAGE_SHIFT;<br>
<br>
</span>This BAR is apparently at least a page in size already, so this only<br>
breaks if we request alignment of more than a page size.<br>
<span><br></span></blockquote><div><br>Oh, yes. I'll remove this one.<br><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span>
> This risk could be easily prevented before because we only had one way<br>
> (kernel parameter resource_alignment) to touch those codes. And even<br>
> some users may be happy to see the extended size.<br>
<br>
</span>Currently, pci_reassigndev_resource_align<wbr>ment() extends the size of<br>
some resources when we use "pci=resource_alignment=..." That<br>
certainly breaks places like the ones above.<br>
<br>
What do you mean by "some users may be happy to see the extended<br>
size"? We're increasing a resource size to something larger than what<br>
the BAR actually responds to. I don't see how that can ever be an<br>
*improvement*. I can see how most drivers won't care, and a few (like<br>
the ones you cite above) will break. But I don't see how any driver<br>
could be *helped* by us making the resource larger.<br>
<div><div class="gmail-m_-839508379538751553h5"><br></div></div></blockquote><div><br><div>From commit 32a9a68 (PCI: allow assignment of memory resources with a<br>specified alignment), it seems that "pci=resource_alignment" was introduced<br>because we want to use PCI pass-through for some page-unaligned devices.<br></div><div><br></div><div>So there might be some cases that users use "pci=resource_alignment" to<br></div><div>extend the BAR's size then passthrough them. That's why I say "some users<br>may be happy to see the extended size". We are not able to passthrough<br></div>the device unless we extend its BARs to PAGE_SIZE.<br> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="gmail-m_-839508379538751553h5">
> But now we introduce pcibios_default_alignment() to set default alignment<br>
> for all PCI devices which would also touch those codes. It would be hard<br>
> to prevent the risk in this case. So this patch tries to use<br>
> START_ALIGNMENT to identify the resource's alignment without extending<br>
> the size when the alignment reassigning is caused by the default alignment.<br>
><br>
> Signed-off-by: Yongji Xie <<a href="mailto:elohimes@gmail.com" target="_blank">elohimes@gmail.com</a>><br>
> ---<br>
> drivers/pci/pci.c | 34 ++++++++++++++++++++++++------<wbr>----<br>
> 1 file changed, 24 insertions(+), 10 deletions(-)<br>
><br>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c<br>
> index 02f1255..358366e 100644<br>
> --- a/drivers/pci/pci.c<br>
> +++ b/drivers/pci/pci.c<br>
> @@ -4959,11 +4959,13 @@ resource_size_t __weak pcibios_default_alignment(stru<wbr>ct pci_dev *dev)<br>
> /**<br>
> * pci_specified_resource_alignme<wbr>nt - get resource alignment specified by user.<br>
> * @dev: the PCI device to get<br>
> + * @resize: whether or not to change resources' size when reassigning alignment<br>
> *<br>
> * RETURNS: Resource alignment if it is specified.<br>
> * Zero if it is not specified.<br>
> */<br>
> -static resource_size_t pci_specified_resource_alignme<wbr>nt(struct pci_dev *dev)<br>
> +static resource_size_t pci_specified_resource_alignme<wbr>nt(struct pci_dev *dev,<br>
> + bool *resize)<br>
> {<br>
> int seg, bus, slot, func, align_order, count;<br>
> unsigned short vendor, device, subsystem_vendor, subsystem_device;<br>
> @@ -5005,6 +5007,7 @@ static resource_size_t pci_specified_resource_alignme<wbr>nt(struct pci_dev *dev)<br>
> (!device || (device == dev->device)) &&<br>
> (!subsystem_vendor || (subsystem_vendor == dev->subsystem_vendor)) &&<br>
> (!subsystem_device || (subsystem_device == dev->subsystem_device))) {<br>
> + *resize = true;<br>
> if (align_order == -1)<br>
> align = PAGE_SIZE;<br>
> else<br>
> @@ -5030,6 +5033,7 @@ static resource_size_t pci_specified_resource_alignme<wbr>nt(struct pci_dev *dev)<br>
> bus == dev->bus->number &&<br>
> slot == PCI_SLOT(dev->devfn) &&<br>
> func == PCI_FUNC(dev->devfn)) {<br>
> + *resize = true;<br>
> if (align_order == -1)<br>
> align = PAGE_SIZE;<br>
> else<br>
> @@ -5062,6 +5066,7 @@ void pci_reassigndev_resource_align<wbr>ment(struct pci_dev *dev)<br>
> struct resource *r;<br>
> resource_size_t align, size;<br>
> u16 command;<br>
> + bool resize = false;<br>
><br>
> /*<br>
> * VF BARs are read-only zero according to SR-IOV spec r1.1, sec<br>
> @@ -5073,7 +5078,7 @@ void pci_reassigndev_resource_align<wbr>ment(struct pci_dev *dev)<br>
> return;<br>
><br>
> /* check if specified PCI is target device to reassign */<br>
> - align = pci_specified_resource_alignme<wbr>nt(dev);<br>
> + align = pci_specified_resource_alignme<wbr>nt(dev, &resize);<br>
> if (!align)<br>
> return;<br>
><br>
> @@ -5101,15 +5106,24 @@ void pci_reassigndev_resource_align<wbr>ment(struct pci_dev *dev)<br>
> }<br>
><br>
> size = resource_size(r);<br>
> - if (size < align) {<br>
> - size = align;<br>
> - dev_info(&dev->dev,<br>
> - "Rounding up size of resource #%d to %#llx.\n",<br>
> - i, (unsigned long long)size);<br>
> + if (resize) {<br>
> + if (size < align) {<br>
> + size = align;<br>
> + dev_info(&dev->dev,<br>
> + "Rounding up size of resource #%d to %#llx.\n",<br>
> + i, (unsigned long long)size);<br>
> + }<br>
> + r->flags |= IORESOURCE_UNSET;<br>
> + r->start = 0;<br>
> + } else {<br>
> + if (size < align) {<br>
> + r->flags &= ~IORESOURCE_SIZEALIGN;<br>
> + r->flags |= IORESOURCE_STARTALIGN |<br>
> + IORESOURCE_UNSET;<br>
> + r->start = align;<br>
> + }<br>
<br>
</div></div>I'm still not happy about the fact that we now have these two cases:<br>
<br>
1) If we increase resource alignment because of the<br>
"pci=resource_alignment=" parameter, we increase the resource<br>
size, and<br>
<br>
2) If we increase resource alignment because of<br>
pcibios_default_alignment(), we use IORESOURCE_STARTALIGN and do<br>
not increase the resource size.<br>
<br>
This makes the code complicated and hard to maintain in the future.<br>
We talked about this earlier [1], but I'm not sure I've figured out<br>
the reason. Here's my guess:<br>
<br>
Let's assume we have a 4K BAR and we're requesting alignment to 64K.<br>
In case 1, we're only changing the alignment for specified devices.<br>
If we used IORESOURCE_STARTALIGN, we would align that 4K BAR of the<br>
specified devices to 64K, but BARs of other devices could still be<br>
assigned in that same 64K page. Therefore, we must increase the size<br>
to prevent other BARs in the page, even though this might break some<br>
drivers.<br>
<br>
But in case 2, we're changing the alignment for *all* devices. In<br>
this case, we *can* use IORESOURCE_STARTALIGN because we're going to<br>
align *every* BAR of every device to 64K, so no other BARs will be put<br>
in the same page. And since we didn't change the resource size, we<br>
avoid the problem of breaking drivers.<br>
<br>
If that's the reasoning, I'm not 100% sure that the complexity of this<br>
code is worth it. It took me half a day to come up with this theory.<br>
If we do have to go this route, we *must* include comments along this<br>
line to make it easier next time.<br>
<br></blockquote><div><br><div>Your guess is <span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight">exactly correct. We have two problems when we need to change<br>one BAR's alignment:<br><br></span></div><div><span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight">1) If we extend the BAR's size, we may break the driver<br><br></span></div><div><span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight">2) If we only change the alignment without extending size, we have no way to<br>prevent other BARs being assigned into the same page.<br><br>I failed to come up with a way to use only one way to satisfy the two cases.<br></span></div><div><span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight">So I handle the two cases </span><span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight">separately like those codes. But as you said, this<br>makes the code hard to maintain. I think what I can do next is to</span><span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight"><span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight"> add some<br>comments</span><span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight"> to make the codes easier to read and maintain.</span> Of course, if<br>anybody have a better idea, I'll be happy to see it. </span><span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight"></span><span class="gmail-m_1255938235226888354gmail-op_dict3_font24 gmail-m_1255938235226888354gmail-op_dict3_marginRight"></span></div><br><div class="gmail_quote">Thanks,<br></div>Yongji <br></div></div></div></div>