[PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()

David Hildenbrand david at redhat.com
Thu Apr 9 17:26:22 AEST 2020


On 09.04.20 04:59, piliu wrote:
> 
> 
> On 04/08/2020 10:46 AM, Baoquan He wrote:
>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>
>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>> blocks as removable"), the user space interface to compute whether a memory
>>> block can be offlined (exposed via
>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>
>> Pingfan, can you have a look at this change on PPC?  Please feel free to
>> give comments if any concern, or offer ack if it's OK to you.
>>
>>>
>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>> we'll start by:
>>> 1. Testing if it contains any holes, and reject if so
>>> 2. Testing if pages belong to different zones, and reject if so
>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>
>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>> it can easily result in false positives/negatives. Let's stop manually
>>> checking is_mem_section_removable(), and let device_offline() handle it
>>> completely instead. We can remove the racy is_mem_section_removable()
>>> implementation next.
>>>
>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>> zone lock when isolating), but maybe we should optimize that
>>> implementation instead if this ever becomes a real problem (after all,
>>> memory unplug is already an expensive operation). We started using
>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>> Implement memory hotplug remove in the kernel"), with the initial
>>> hotremove support of lmbs.
>>>
>>> Cc: Nathan Fontenot <nfont at linux.vnet.ibm.com>
>>> Cc: Michael Ellerman <mpe at ellerman.id.au>
>>> Cc: Benjamin Herrenschmidt <benh at kernel.crashing.org>
>>> Cc: Paul Mackerras <paulus at samba.org>
>>> Cc: Michal Hocko <mhocko at suse.com>
>>> Cc: Andrew Morton <akpm at linux-foundation.org>
>>> Cc: Oscar Salvador <osalvador at suse.de>
>>> Cc: Baoquan He <bhe at redhat.com>
>>> Cc: Wei Yang <richard.weiyang at gmail.com>
>>> Signed-off-by: David Hildenbrand <david at redhat.com>
>>> ---
>>>  .../platforms/pseries/hotplug-memory.c        | 26 +++----------------
>>>  1 file changed, 3 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> index b2cde1732301..5ace2f9a277e 100644
>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>>  
>>>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>>>  {
>>> -	int i, scns_per_block;
>>> -	bool rc = true;
>>> -	unsigned long pfn, block_sz;
>>> -	u64 phys_addr;
>>> -
>>>  	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>>  		return false;
>>>  
>>> -	block_sz = memory_block_size_bytes();
>>> -	scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>>> -	phys_addr = lmb->base_addr;
>>> -
>>>  #ifdef CONFIG_FA_DUMP
>>>  	/*
>>>  	 * Don't hot-remove memory that falls in fadump boot memory area
>>>  	 * and memory that is reserved for capturing old kernel memory.
>>>  	 */
>>> -	if (is_fadump_memory_area(phys_addr, block_sz))
>>> +	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>>  		return false;
>>>  #endif
>>> -
>>> -	for (i = 0; i < scns_per_block; i++) {
>>> -		pfn = PFN_DOWN(phys_addr);
>>> -		if (!pfn_in_present_section(pfn)) {
>>> -			phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> -			continue;
>>> -		}
>>> -
>>> -		rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>>> -		phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> -	}
>>> -
>>> -	return rc;
>>> +	/* device_offline() will determine if we can actually remove this lmb */
>>> +	return true;
> So I think here swaps the check and do sequence. At least it breaks
> dlpar_memory_remove_by_count(). It is doable to remove
> is_mem_section_removable(), but here should be more effort to re-arrange
> the code.
> 

Thanks Pingfan,

1. "swaps the check and do sequence":

Partially. Any caller of dlpar_remove_lmb() already has to deal with
false positives. device_offline() can easily fail after
lmb_is_removable() == true. It's inherently racy.
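
To illustrate the false-positive case, here is a minimal userspace sketch
(not kernel code). All sim_* names and the has_unmovable flag are made up
for illustration; they only model the roles of lmb_is_removable(),
device_offline() and dlpar_remove_lmb(). The pre-check can only say "maybe";
the offline attempt gives the real answer.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an LMB; this is not struct drmem_lmb. */
struct sim_lmb {
	bool assigned;       /* models the DRCONF_MEM_ASSIGNED flag */
	bool has_unmovable;  /* can change at any time after a pre-check */
};

/* Models the simplified lmb_is_removable(): cheap and advisory only. */
static bool sim_lmb_is_removable(const struct sim_lmb *lmb)
{
	return lmb->assigned;
}

/* Models device_offline(): the authoritative answer, 0 on success. */
static int sim_device_offline(const struct sim_lmb *lmb)
{
	return lmb->has_unmovable ? -EBUSY : 0;
}

/* Models dlpar_remove_lmb(): pre-check first, then the real attempt. */
static int sim_remove_lmb(const struct sim_lmb *lmb)
{
	assert(lmb != NULL);

	if (!sim_lmb_is_removable(lmb))
		return -EINVAL;

	/* Can still fail even though the pre-check passed. */
	return sim_device_offline(lmb);
}
```

An lmb with assigned == true and has_unmovable == true passes the pre-check
but fails with -EBUSY when actually offlined, so callers have to handle
that error either way.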

2. "breaks dlpar_memory_remove_by_count()"

Can you elaborate why it "breaks" it? It will simply try to
offline+remove lmbs, detect that it wasn't able to offline+remove as
many as it wanted (which could easily happen before as well), and re-add
the already offlined+removed ones.
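
The try-then-roll-back flow described above can be sketched in plain
userspace C as follows. This is an illustrative model under stated
assumptions, not the actual dlpar_memory_remove_by_count() code: struct
toy_lmb, its fields and the toy_* helpers are hypothetical, and
can_offline simply stands in for whatever device_offline() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an LMB; not the kernel's struct drmem_lmb. */
struct toy_lmb {
	bool present;      /* currently added to the system */
	bool can_offline;  /* what an offline attempt would report */
	bool removed_now;  /* removed as part of the current request */
};

/* Models offline+remove of one LMB: 0 on success, -1 on failure. */
static int toy_remove(struct toy_lmb *lmb)
{
	if (!lmb->present || !lmb->can_offline)
		return -1;
	lmb->present = false;
	return 0;
}

/* Models re-adding a previously removed LMB. */
static void toy_add(struct toy_lmb *lmb)
{
	lmb->present = true;
}

/* Try to remove `want` LMBs; on a shortfall, re-add the removed ones. */
static int toy_remove_by_count(struct toy_lmb *lmbs, size_t n, size_t want)
{
	size_t i, removed = 0;

	assert(lmbs != NULL);

	for (i = 0; i < n; i++)
		lmbs[i].removed_now = false;

	/* No up-front "is it removable" pass: just try each LMB. */
	for (i = 0; i < n && removed < want; i++) {
		if (toy_remove(&lmbs[i]) == 0) {
			lmbs[i].removed_now = true;
			removed++;
		}
	}

	if (removed == want)
		return 0;

	/* Not enough LMBs could be removed: undo the partial work. */
	for (i = 0; i < n; i++) {
		if (lmbs[i].removed_now) {
			toy_add(&lmbs[i]);
			lmbs[i].removed_now = false;
		}
	}
	return -1;
}
```

The point of the sketch is that the rollback path is needed with or
without a pre-check, because any individual removal can fail at the time
it is attempted.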

3. "more effort to re-arrange the code"

What would be your suggestion?

We would rip out the racy up-front check of whether we can remove as much
memory as requested in dlpar_memory_remove_by_count(), and simply always
try to remove, recovering (re-adding) on failure.


-- 
Thanks,

David / dhildenb


