[PATCH v8 20/45] powerpc/powernv: Allocate PE# in reverse order

Gavin Shan gwshan at linux.vnet.ibm.com
Wed Apr 20 11:04:35 AEST 2016


On Tue, Apr 19, 2016 at 01:07:59PM +1000, Alexey Kardashevskiy wrote:
>On 02/17/2016 02:44 PM, Gavin Shan wrote:
>>PE number for one particular PE can be allocated dynamically or
>>reserved according to the consumed M64 (64-bits prefetchable)
>>segments of the PE. The M64 resources, and hence their segments
>>and PE number are assigned/reserved in ascending order. The PE
>>numbers are allocated dynamically in ascending order as well.
>>It's not a problem as the PE numbers are reserved and then
>>allocated all at once in fine order. However, it will introduce
>>conflicts when PCI hotplug is supported: the PE number to be
>>reserved for newly added PE might have been assigned.
>>
>>To resolve above conflicts, this forces the PE number to be
>>allocated dynamically in reverse order. With this patch applied,
>>the PE numbers are reserved in ascending order, but allocated
>>dynamically in reverse order.
>
>
>The patch is probably is ok, the commit log is not - I do not follow it. Some
>PEs are reserved (for what? why does the absolute PE number matter? put it in
>the commit log), that means that the corresponding bits in pe_alloc[] should
>be set so when you will be allocating PEs for a just plugged device, you
>won't pick them and you will pick free ones, and the order should not matter.
>I would think that "reservation" happens once at the boot time so you set
>"used" bits for the reserved PEs then and after that the dynamic allocator
>will skip them.
>

I will enhance the commit log in next revision, perhaps just pick part of
below words: On PHB3, there are 16 M64 BARs in hardware. The last one is
split ovenly into 256 segments. Each segment can be associated/assigned
to fixed PE# (segment#x <-> PE#x) which is how the hardware was designed.
If one plugged PE has M64 (64-bits prefetchable memory) resources, its
PE# is equal to the segment#. Otherwise, the PE# is allocated dynamically
if the PE doesn't contain M64 resource.

The M64 resources are assigned from low to high end, meaning the reserved
PE# (according to the M64 segments) are grown from low to high end. It's
most likely to get a dynamically allocated PE# which should be reserved
because of M64 segment. It's the conflicts the patch tries to resolve.

The PE# reservation doesn't happen once at boot time because it's
unknow how many PEs and how much M64 resources will be hot added.

>
>>
>>Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>>---
>>  arch/powerpc/platforms/powernv/pci-ioda.c | 14 ++++++--------
>>  1 file changed, 6 insertions(+), 8 deletions(-)
>>
>>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>index f182ca7..565725b 100644
>>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>@@ -144,16 +144,14 @@ static void pnv_ioda_reserve_pe(struct pnv_phb *phb, int pe_no)
>>
>>  static struct pnv_ioda_pe *pnv_ioda_alloc_pe(struct pnv_phb *phb)
>>  {
>>-	unsigned long pe;
>>+	unsigned long pe = phb->ioda.total_pe_num - 1;
>>
>>-	do {
>>-		pe = find_next_zero_bit(phb->ioda.pe_alloc,
>>-					phb->ioda.total_pe_num, 0);
>>-		if (pe >= phb->ioda.total_pe_num)
>>-			return NULL;
>>-	} while(test_and_set_bit(pe, phb->ioda.pe_alloc));
>>+	for (pe = phb->ioda.total_pe_num - 1; pe >= 0; pe--) {
>>+		if (!test_and_set_bit(pe, phb->ioda.pe_alloc))
>>+			return pnv_ioda_init_pe(phb, pe);
>>+	}
>>
>>-	return pnv_ioda_init_pe(phb, pe);
>>+	return NULL;
>>  }
>>
>>  static void pnv_ioda_free_pe(struct pnv_ioda_pe *pe)
>>
>
>
>-- 
>Alexey
>



More information about the Linuxppc-dev mailing list