[RFC FIX PATCH v0] powerpc,numa: Fix memory_hotplug_max()

Nathan Fontenot nfont at linux.vnet.ibm.com
Thu May 5 01:29:19 AEST 2016


On 04/18/2016 10:54 PM, Bharata B Rao wrote:
> On Sat, Apr 09, 2016 at 03:44:31PM +0530, Bharata B Rao wrote:
>> On Fri, Apr 08, 2016 at 12:27:44AM -0500, Nathan Fontenot wrote:
>>> On 04/06/2016 04:44 AM, Bharata B Rao wrote:
>>>> memory_hotplug_max() uses hot_add_drconf_memory_max() to get maxmimum
>>>> addressable memory by referring to ibm,dyanamic-memory property. There
>>>> are three problems with the current approach:
>>>>
>>>> 1 hot_add_drconf_memory_max() assumes that ibm,dynamic-memory includes
>>>>   all the LMBs of the guest, but that is not true for PowerKVM which
>>>>   populates only DR LMBs (LMBs that can be hotplugged/removed) in that
>>>>   property.
>>>> 2 hot_add_drconf_memory_max() multiplies lmb-size with lmb-count to arrive
>>>>   at the max possible address. Since ibm,dynamic-memory doesn't include
>>>>   RMA LMBs, the address thus obtained will be less than the actual max
>>>>   address. For example, if max possible memory size is 32G, with lmb-size
>>>>   of 256MB there can be 127 LMBs in ibm,dynamic-memory (1 LMB for RMA
>>>>   which won't be present here).  hot_add_drconf_memory_max() would then
>>>>   return the max addressable memory as 127 * 256MB = 31.75GB, the max
>>>>   address should have been 32G which is what ibm,lrdr-capacity shows.
>>>> 3 In PowerKVM, there can be a gap between the end of boot time RAM and
>>>>   beginning of hotplug RAM area. So just multiplying lmb-count with
>>>>   lmb-size will not provide the correct max possible address for PowerKVM.
>>>>
>>>> This patch fixes 1 by using ibm,lrdr-capacity property to return the max
>>>> addressable memory whenever the property is present. Then it fixes 2 & 3
>>>> by fetching the address of the last LMB in ibm,dynamic-memory property.
>>>>
>>>> NOTE: There are some unnecessary changes in the patch because of converting
>>>> spaces to tabs w/o which checkpatch.pl complains.
>>>>
>>>> Signed-off-by: Bharata B Rao <bharata at linux.vnet.ibm.com>
>>>> ---
>>>>  arch/powerpc/mm/numa.c | 29 ++++++++++++++++++++++-------
>>>>  1 file changed, 22 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>>>> index 669a15e..57d5877 100644
>>>> --- a/arch/powerpc/mm/numa.c
>>>> +++ b/arch/powerpc/mm/numa.c
>>>> @@ -1164,17 +1164,32 @@ int hot_add_scn_to_nid(unsigned long scn_addr)
>>>>  static u64 hot_add_drconf_memory_max(void)
>>>>  {
>>>>          struct device_node *memory = NULL;
>>>> -        unsigned int drconf_cell_cnt = 0;
>>>> -        u64 lmb_size = 0;
>>>> +	struct device_node *dn = NULL;
>>>> +	unsigned int drconf_cell_cnt = 0;
>>>> +	u64 lmb_size = 0;
>>>>  	const __be32 *dm = NULL;
>>>> +	const __be64 *lrdr = NULL;
>>>> +	struct of_drconf_cell drmem;
>>>> +
>>>> +	dn = of_find_node_by_path("/rtas");
>>>> +	if (dn) {
>>>> +		lrdr = of_get_property(dn, "ibm,lrdr-capacity", NULL);
>>>> +		of_node_put(dn);
>>>> +		if (lrdr)
>>>> +			return be64_to_cpup(lrdr);
>>>> +	}
>>>>  
>>>>          memory = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
>>>>          if (memory) {
>>>> -                drconf_cell_cnt = of_get_drconf_memory(memory, &dm);
>>>> -                lmb_size = of_get_lmb_size(memory);
>>>> -                of_node_put(memory);
>>>> -        }
>>>> -        return lmb_size * drconf_cell_cnt;
>>>> +		drconf_cell_cnt = of_get_drconf_memory(memory, &dm);
>>>> +		lmb_size = of_get_lmb_size(memory);
>>>> +
>>>> +		/* Advance to the last cell, each cell has 6 32 bit integers */
>>>> +		dm += (drconf_cell_cnt - 1) * 6;
>>>
>>> You could do this as follows to avoid hard-coding 6
>>
>> Can't do that since dm is of type __be32 pointer.
>>
>>> 		dm += (drconf_cell_cnt - 1) * sizeof(struct of_drconf_cell)
>>>
>>>> +		read_drconf_cell(&drmem, &dm);
>>>> +		of_node_put(memory);
>>>> +	}
>>>> +	return drmem.base_addr + lmb_size;
>>>
>>> I assume it is a safe assumption that there will only be 1 RMA LMB?
>>
>> No, I am not assuming RMA to have 1 LMB here. I fetch the last LMB and
>> get the max possible address from it by adding the base address of the
>> last LMB with the lmb_size.
>>
>>>
>>> I do see that the PAPR defines a bit in the flags field for each LMB
>>> in ibm,dynamic-memory as 'reserved'. Is this something you could use
>>> to flag RMA LMBs and put them in the ibm,dynamic-memory property?
>>>
>>> I'm just curious why these LMBs are not in this property.
>>
>> Not sure about both the above observations.
>>
>> Section B.6.6 of LoPAPR mentions "... called the RMA, that is represented
>> by the first value of the reg property of this first /memory node. Additional
>> storage regions may each be represented by their own /memory node that
>> includes dynamic reconfiguration (DR) properties or by an entry in
>> /ibm,dynamic-reconfiguration-memory nodes"
>>
>> Section B.6.6.2 says "All memory which is not subject to dynamic
>> reconfiguration (such as the RMA) is represented in /memory node(s).
>>
>> Currently PowerKVM puts RMA region and boot time memory (which is
>> currently non DR in case of PowerKVM) in to their own /memory nodes and
>> puts only DR LMBs (hot pluggable) RAM into
>> ibm,dynamic-reconfiguration-memory node.
> 
> Nathan - If the fix is correct and the behaviour is as per PAPR, could this
> be pursued further ?
> 

ok, reading your excerpts from the LoPAPR what you're doing sounds correct.

-Nathan



More information about the Linuxppc-dev mailing list