[PATCH v6 05/42] powerpc/powernv: Track IO/M32/M64 segments from PE

Alexey Kardashevskiy aik at ozlabs.ru
Tue Aug 11 12:23:42 AEST 2015


On 08/11/2015 10:03 AM, Gavin Shan wrote:
> On Mon, Aug 10, 2015 at 05:16:40PM +1000, Alexey Kardashevskiy wrote:
>> On 08/06/2015 02:11 PM, Gavin Shan wrote:
>>> The patch is adding 6 bitmaps, three to PE and three to PHB, to track
>>
>> The patch is also removing 2 arrays (io_segmap and m32_segmap), what is that
>> all about? Also, there was no m64_segmap, now there is, needs an explanation
>> may be.
>>
>
> Originally, the bitmaps (io_segmap and m32_segmap) are allocated dynamically.
> Now, they have fixed sizes - 512 bits.
>
> The subject "powerpc/powernv: Track IO/M32/M64 segments from PE" indicates
> why m64_segmap is added.


But before this patch, you somehow managed to keep it working without a map 
for M64, by the same time you needed map for IO and M32. It seems you are 
making things consistent in this patch but it also feels like you do not 
have to do so as M64 did not need a map before and I cannot see why it 
needs one now.


>>
>>> the consumed by one particular PE, which can be released once the PE
>>> is destroyed during PCI unplugging time. Also, we're using fixed
>>> quantity of bits to trace the used IO, M32 and M64 segments by PEs
>>> in one particular PHB.
>>>
>>
>> Out of curiosity - have you considered having just 3 arrays, in PHB, storing
>> PE numbers, and ditching PE's arrays? Does PE itself need to know what PEs it
>> is using? Not sure about this master/slave PEs though.
>>
>
> I don't follow your suggestion. Can you rephrase and explain it a bit more?


Please explains in what situations you need same map in both PHB and PE and 
how you are going to use them. For example, pe::m64_segmap and phb::m64_segmap.

I believe you need to know what segment is used by what PE and that's it 
and having 2 bitmaps is overcomplicated hard to follow. Is there anything 
else what I am missing?



>> It would be easier to read patches if this one was right before
>> [PATCH v6 23/42] powerpc/powernv: Release PEs dynamically
>>
>
> I'll try to reoder the patch, but not expect too much...
>
>>
>>
>>> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
>>> ---
>>>   arch/powerpc/platforms/powernv/pci-ioda.c | 29 +++++++++++++++--------------
>>>   arch/powerpc/platforms/powernv/pci.h      | 18 ++++++++++++++----
>>>   2 files changed, 29 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> index e4ac703..78b49a1 100644
>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>>> @@ -388,6 +388,12 @@ static int pnv_ioda_pick_m64_pe(struct pci_bus *bus, bool all)
>>>   			list_add_tail(&pe->list, &master_pe->slaves);
>>>   		}
>>>
>>> +		/* M64 segments consumed by slave PEs are tracked
>>> +		 * by master PE
>>> +		 */
>>> +		set_bit(pe->pe_number, master_pe->m64_segmap);
>>> +		set_bit(pe->pe_number, phb->ioda.m64_segmap);
>>> +
>>>   		/* P7IOC supports M64DT, which helps mapping M64 segment
>>>   		 * to one particular PE#. However, PHB3 has fixed mapping
>>>   		 * between M64 segment and PE#. In order to have same logic
>>> @@ -2871,9 +2877,11 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
>>>
>>>   			while (index < phb->ioda.total_pe &&
>>>   			       region.start <= region.end) {
>>> -				phb->ioda.io_segmap[index] = pe->pe_number;
>>> +				set_bit(index, pe->io_segmap);
>>> +				set_bit(index, phb->ioda.io_segmap);
>>>   				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
>>> -					pe->pe_number, OPAL_IO_WINDOW_TYPE, 0, index);
>>> +					pe->pe_number, OPAL_IO_WINDOW_TYPE,
>>> +					0, index);
>>
>> Unrelated change.
>>
>
> True, will drop. However, checkpatch.pl will complain wtih:
> exceeding 80 characters.

It will not as you are not changing these lines, it only complains on changes.



>
>>>   				if (rc != OPAL_SUCCESS) {
>>>   					pr_err("%s: OPAL error %d when mapping IO "
>>>   					       "segment #%d to PE#%d\n",
>>> @@ -2896,9 +2904,11 @@ static void pnv_ioda_setup_pe_seg(struct pci_controller *hose,
>>>
>>>   			while (index < phb->ioda.total_pe &&
>>>   			       region.start <= region.end) {
>>> -				phb->ioda.m32_segmap[index] = pe->pe_number;
>>> +				set_bit(index, pe->m32_segmap);
>>> +				set_bit(index, phb->ioda.m32_segmap);
>>>   				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
>>> -					pe->pe_number, OPAL_M32_WINDOW_TYPE, 0, index);
>>> +					pe->pe_number, OPAL_M32_WINDOW_TYPE,
>>> +					0, index);
>>
>> Unrelated change.
>>
>
> same as above.
>
>>>   				if (rc != OPAL_SUCCESS) {
>>>   					pr_err("%s: OPAL error %d when mapping M32 "
>>>   					       "segment#%d to PE#%d",
>>> @@ -3090,7 +3100,7 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>>>   {
>>>   	struct pci_controller *hose;
>>>   	struct pnv_phb *phb;
>>> -	unsigned long size, m32map_off, pemap_off, iomap_off = 0;
>>> +	unsigned long size, pemap_off;
>>>   	const __be64 *prop64;
>>>   	const __be32 *prop32;
>>>   	int len;
>>> @@ -3175,19 +3185,10 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>>>
>>>   	/* Allocate aux data & arrays. We don't have IO ports on PHB3 */
>>
>>
>> This comment came with if(IODA1) below, since you are removing the condition
>> below, makes sense to remove the comment as well or move it where people will
>> look for it (arch/powerpc/platforms/powernv/pci.h ?)
>>
>
> Yes, will do.
>
>>
>>>   	size = _ALIGN_UP(phb->ioda.total_pe / 8, sizeof(unsigned long));
>>> -	m32map_off = size;
>>> -	size += phb->ioda.total_pe * sizeof(phb->ioda.m32_segmap[0]);
>>> -	if (phb->type == PNV_PHB_IODA1) {
>>> -		iomap_off = size;
>>> -		size += phb->ioda.total_pe * sizeof(phb->ioda.io_segmap[0]);
>>> -	}
>>>   	pemap_off = size;
>>>   	size += phb->ioda.total_pe * sizeof(struct pnv_ioda_pe);
>>>   	aux = memblock_virt_alloc(size, 0);
>>
>>
>> After adding static arrays to PE and PHB, do you still need this "aux"?
>>
>
> "aux" is still needed to tell the boundary of pe_alloc_bitmap and pe_array.
>>
>>>   	phb->ioda.pe_alloc = aux;
>>> -	phb->ioda.m32_segmap = aux + m32map_off;
>>> -	if (phb->type == PNV_PHB_IODA1)
>>> -		phb->ioda.io_segmap = aux + iomap_off;
>>>   	phb->ioda.pe_array = aux + pemap_off;
>>>   	set_bit(phb->ioda.reserved_pe, phb->ioda.pe_alloc);
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/pci.h b/arch/powerpc/platforms/powernv/pci.h
>>> index 62239b1..08a4e57 100644
>>> --- a/arch/powerpc/platforms/powernv/pci.h
>>> +++ b/arch/powerpc/platforms/powernv/pci.h
>>> @@ -49,6 +49,15 @@ struct pnv_ioda_pe {
>>>   	/* PE number */
>>>   	unsigned int		pe_number;
>>>
>>> +	/* IO/M32/M64 segments consumed by the PE. Each PE can
>>> +	 * have one M64 segment at most, but M64 segments consumed
>>> +	 * by slave PEs will be contributed to the master PE. One
>>> +	 * PE can own multiple IO and M32 segments.
>>
>>
>> A PE can have multiple IO and M32 segments but just one M64 segment? Is this
>> correct for IODA1 or IODA2 or both? Is this a limitation of this
>> implementation or it comes from P7IOC/PHB3 hardware?
>>
>
> It's correct for IO and M32. However, on IODA1 or IODA2, one PE can have
> multiple M64 segments as well.


But the comment says "Each PE can have one M64 segment at most". Which 
statement is correct?


>>> +	 */
>>> +	unsigned long		io_segmap[8];
>>> +	unsigned long		m32_segmap[8];
>>> +	unsigned long		m64_segmap[8];
>>
>> Magic constant "8", 64bit*8 = 512 PEs - where did this come from?
>>
>> Anyway,
>>
>> #define PNV_IODA_MAX_PE_NUM	512
>>
>> unsigned long io_segmap[PNV_IODA_MAX_PE_NUM/BITS_PER_LONG]
>>
>
> I prefer "8", not macro for 3 reasons:
> - The macro won't be used in the code.

You will use it 6 times in the header, if you give it a good name, people 
won't have to guess if the meaning of all these "8"s is the same and you 
won't have to comment every use of it in this header file (now you have).

Also, using BITS_PER_LONG tells the reader that this is a bitmask for sure.


> - The total segment number of specific resource is variable
>    on IODA1 and IODA2. I just choosed the max value with margin.
> - PNV_IODA_MAX_PE_NUM, indicating max PE number, isn't 512 on
>    IODA1 or IODA2.

Give it a better name.


>
>>> +
>>>   	/* "Weight" assigned to the PE for the sake of DMA resource
>>>   	 * allocations
>>>   	 */
>>> @@ -145,15 +154,16 @@ struct pnv_phb {
>>>   			unsigned int		io_segsize;
>>>   			unsigned int		io_pci_base;
>>>
>>> +			/* IO, M32, M64 segment maps */
>>> +			unsigned long		io_segmap[8];
>>> +			unsigned long		m32_segmap[8];
>>> +			unsigned long		m64_segmap[8];
>>> +
>>>   			/* PE allocation */
>>>   			struct mutex		pe_alloc_mutex;
>>>   			unsigned long		*pe_alloc;
>>>   			struct pnv_ioda_pe	*pe_array;
>>>
>>> -			/* M32 & IO segment maps */
>>> -			unsigned int		*m32_segmap;
>>> -			unsigned int		*io_segmap;
>>> -
>>>   			/* IRQ chip */
>>>   			int			irq_chip_init;
>>>   			struct irq_chip		irq_chip;
>>>
>
> Thanks,
> Gavin
>


-- 
Alexey


More information about the Linuxppc-dev mailing list