[PATCH V7 04/17] PCI: Take additional IOV BAR alignment in sizing and assigning

Wei Yang weiyang at linux.vnet.ibm.com
Wed Sep 10 13:27:43 EST 2014


On Tue, Sep 09, 2014 at 02:09:46PM -0600, Bjorn Helgaas wrote:
>On Wed, Aug 20, 2014 at 12:14 AM, Wei Yang <weiyang at linux.vnet.ibm.com> wrote:
>> On Tue, Aug 19, 2014 at 09:08:41PM -0600, Bjorn Helgaas wrote:
>>>On Thu, Jul 24, 2014 at 02:22:14PM +0800, Wei Yang wrote:
>>>> At resource sizing/assigning stage, resources are divided into two lists,
>>>> requested list and additional list, while the alignement of the additional
>>>> IOV BAR is not taken into the sizeing and assigning procedure.
>>>>
>>>> This is reasonable in the original implementation, since IOV BAR's alignment is
>>>> mostly the size of a PF BAR alignemt. This means the alignment is already taken
>>>> into consideration. While this rule may be violated on some platform.
>>>>
>>>> This patch take the additional IOV BAR alignment in sizing and assigning stage
>>>> explicitly.
>>>>
>>>> Signed-off-by: Wei Yang <weiyang at linux.vnet.ibm.com>
>>>> ---
>>>>  drivers/pci/setup-bus.c |   68 +++++++++++++++++++++++++++++++++++++++++------
>>>>  1 file changed, 60 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
>>>> index a5a63ec..d83681f 100644
>>>> --- a/drivers/pci/setup-bus.c
>>>> +++ b/drivers/pci/setup-bus.c
>>>> @@ -120,6 +120,28 @@ static resource_size_t get_res_add_size(struct list_head *head,
>>>>      return 0;
>>>>  }
>>>>
>>>> +static resource_size_t get_res_add_align(struct list_head *head,
>>>> +            struct resource *res)
>>>> +{
>>>> +    struct pci_dev_resource *dev_res;
>>>> +
>>>> +    list_for_each_entry(dev_res, head, list) {
>>>> +            if (dev_res->res == res) {
>>>> +                    int idx = res - &dev_res->dev->resource[0];
>>>> +
>>>> +                    dev_printk(KERN_DEBUG, &dev_res->dev->dev,
>>>> +                               "res[%d]=%pR get_res_add_align min_align %llx\n",
>>>> +                               idx, dev_res->res,
>>>> +                               (unsigned long long)dev_res->min_align);
>>>> +
>>>> +                    return dev_res->min_align;
>>>> +            }
>>>> +    }
>>>> +
>>>> +    return 0;
>>>> +}
>>>
>>>I see that you copied the structure of the existing get_res_add_size()
>>>here.  But I don't understand *that* function.  It looks basically like
>>>this:
>>>
>>>  resource_size_t get_res_add_size(list, res)
>>>  {
>>>    list_for_each_entry(dev_res, head, list) {
>>>      if (dev_res->res == res)
>>>        return dev_res->add_size;
>>>    }
>>>    return 0;
>>>  }
>>>
>>>and we call it like this:
>>>
>>>  dev_res->res->end += get_res_add_size(realloc_head, dev_res->res);
>>>
>>>So we start out with dev_res", pass in dev_res->res, search the
>>>realloc_head list to find dev_res again, and return dev_res->add_size.
>>>That looks equivalent to just:
>>>
>>>  dev_res->res->end += dev_res->add_size;
>>>
>>>It looks like get_res_add_size() merely adds a printk and some complexity.
>>>Am I missing something?
>>>
>>
>> Let me try to explain it, if not correct, please let know :-)
>>
>>   dev_res->res->end += get_res_add_size(realloc_head, dev_res->res);
>>
>> would be expanded to:
>>
>>   dev_res->res->end += dev_res_1->add_size;
>>
>> with the dev_res_1 is another one from dev_res which is stored in realloc_head.
>
>Yep, I see now.
>
>>>I do see that there are other callers where we don't actually start with
>>>dev_res, which makes it a little more complicated.  But I think you should
>>>either add something like this:
>>>
>>>  struct pci_dev_resource *res_to_dev_res(list, res)
>>>  {
>>>    list_for_each_entry(dev_res, head, list) {
>>>      if (dev_res->res == res)
>>>        return dev_res;
>>>    }
>>>    return NULL;
>>>  }
>>>
>>
>> Ok, we can extract the common part of these two functions.
>>
>>>which can be used to replace get_res_add_size() and get_res_add_align(), OR
>>>figure out whether the dev_res of interest is always one we've just added.
>>>If it is, maybe you can just make add_to_list() return the dev_res pointer
>>>instead of an errno, and hang onto the pointer.  I'd like that much better
>>>if that's possible.
>>>
>>
>> Sorry, I don't get this point.
>
>Don't worry, it didn't make sense.  I was thinking that we knew the
>dev_res up front and didn't need to look it up, but that's not the
>case.
>
>Sorry it took me so long to respond to this; I'm a bit swamped dealing
>with some regressions.

:-) Never mind, those regressions are with higher priority then this new
feature.

And I found some bugs in this version during the test, and will merge those
fixes in the next version.

>
>Bjorn
>
>> add_to_list() is used to create the pci_dev_resource list, get_res_add_size()
>> and get_res_add_align() is to retrieve the information in the list. I am not
>> sure how to leverage add_to_list() in these two functions?
>>
>>>> +
>>>> +
>>>>  /* Sort resources by alignment */
>>>>  static void pdev_sort_resources(struct pci_dev *dev, struct list_head *head)
>>>>  {
>>>> @@ -368,8 +390,9 @@ static void __assign_resources_sorted(struct list_head *head,
>>>>      LIST_HEAD(save_head);
>>>>      LIST_HEAD(local_fail_head);
>>>>      struct pci_dev_resource *save_res;
>>>> -    struct pci_dev_resource *dev_res, *tmp_res;
>>>> +    struct pci_dev_resource *dev_res, *tmp_res, *dev_res2;
>>>>      unsigned long fail_type;
>>>> +    resource_size_t add_align, align;
>>>>
>>>>      /* Check if optional add_size is there */
>>>>      if (!realloc_head || list_empty(realloc_head))
>>>> @@ -384,10 +407,31 @@ static void __assign_resources_sorted(struct list_head *head,
>>>>      }
>>>>
>>>>      /* Update res in head list with add_size in realloc_head list */
>>>> -    list_for_each_entry(dev_res, head, list)
>>>> +    list_for_each_entry_safe(dev_res, tmp_res, head, list) {
>>>>              dev_res->res->end += get_res_add_size(realloc_head,
>>>>                                                      dev_res->res);
>>>>
>>>> +            if (!(dev_res->res->flags & IORESOURCE_STARTALIGN))
>>>> +                    continue;
>>>> +
>>>> +            add_align = get_res_add_align(realloc_head, dev_res->res);
>>>> +
>>>> +            if (add_align > dev_res->res->start) {
>>>> +                    dev_res->res->start = add_align;
>>>> +                    dev_res->res->end = add_align +
>>>> +                                        resource_size(dev_res->res);
>>>> +
>>>> +                    list_for_each_entry(dev_res2, head, list) {
>>>> +                            align = pci_resource_alignment(dev_res2->dev,
>>>> +                                                           dev_res2->res);
>>>> +                            if (add_align > align)
>>>> +                                    list_move_tail(&dev_res->list,
>>>> +                                                   &dev_res2->list);
>>>> +                    }
>>>> +               }
>>>> +
>>>> +    }
>>>> +
>>>>      /* Try updated head list with add_size added */
>>>>      assign_requested_resources_sorted(head, &local_fail_head);
>>>>
>>>> @@ -930,6 +974,8 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
>>>>      struct resource *b_res = find_free_bus_resource(bus,
>>>>                                      mask | IORESOURCE_PREFETCH, type);
>>>>      resource_size_t children_add_size = 0;
>>>> +    resource_size_t children_add_align = 0;
>>>> +    resource_size_t add_align = 0;
>>>>
>>>>      if (!b_res)
>>>>              return -ENOSPC;
>>>> @@ -954,6 +1000,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
>>>>                      /* put SRIOV requested res to the optional list */
>>>>                      if (realloc_head && i >= PCI_IOV_RESOURCES &&
>>>>                                      i <= PCI_IOV_RESOURCE_END) {
>>>> +                            add_align = max(pci_resource_alignment(dev, r), add_align);
>>>>                              r->end = r->start - 1;
>>>>                              add_to_list(realloc_head, dev, r, r_size, 0/* don't care */);
>>>>                              children_add_size += r_size;
>>>> @@ -984,8 +1031,11 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
>>>>                      if (order > max_order)
>>>>                              max_order = order;
>>>>
>>>> -                    if (realloc_head)
>>>> +                    if (realloc_head) {
>>>>                              children_add_size += get_res_add_size(realloc_head, r);
>>>> +                            children_add_align = get_res_add_align(realloc_head, r);
>>>> +                            add_align = max(add_align, children_add_align);
>>>> +                    }
>>>>              }
>>>>      }
>>>>
>>>> @@ -996,7 +1046,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
>>>>              add_size = children_add_size;
>>>>      size1 = (!realloc_head || (realloc_head && !add_size)) ? size0 :
>>>>              calculate_memsize(size, min_size, add_size,
>>>> -                            resource_size(b_res), min_align);
>>>> +                            resource_size(b_res), max(min_align, add_align));
>>>>      if (!size0 && !size1) {
>>>>              if (b_res->start || b_res->end)
>>>>                      dev_info(&bus->self->dev, "disabling bridge window %pR to %pR (unused)\n",
>>>> @@ -1008,10 +1058,12 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
>>>>      b_res->end = size0 + min_align - 1;
>>>>      b_res->flags |= IORESOURCE_STARTALIGN;
>>>>      if (size1 > size0 && realloc_head) {
>>>> -            add_to_list(realloc_head, bus->self, b_res, size1-size0, min_align);
>>>> -            dev_printk(KERN_DEBUG, &bus->self->dev, "bridge window %pR to %pR add_size %llx\n",
>>>> -                       b_res, &bus->busn_res,
>>>> -                       (unsigned long long)size1-size0);
>>>> +            add_to_list(realloc_head, bus->self, b_res, size1-size0,
>>>> +                            max(min_align, add_align));
>>>> +            dev_printk(KERN_DEBUG, &bus->self->dev, "bridge window "
>>>> +                             "%pR to %pR add_size %llx add_align %llx\n", b_res,
>>>> +                             &bus->busn_res, (unsigned long long)size1-size0,
>>>> +                             max(min_align, add_align));
>>>
>>>Factor out this "max(min_align, add_align)" thing so we don't have to
>>>change these lines.  Bonus points if you can also factor it out of the
>>>calculate_memsize() call above.  That one is a pretty complicated ternary
>>>expression that should probably be turned into an "if" instead anyway.
>>>
>>
>> Ok, I get your point. Let me make it more easy to read.
>>
>>>>      }
>>>>      return 0;
>>>>  }
>>>> --
>>>> 1.7.9.5
>>>>
>>
>> --
>> Richard Yang
>> Help you, Help me
>>

-- 
Richard Yang
Help you, Help me



More information about the Linuxppc-dev mailing list