[RFC PATCH v3 2/13] memory-hotplug : add physical memory hotplug code to acpi_memory_device_remove

Yasuaki Ishimatsu isimatu.yasuaki at jp.fujitsu.com
Tue Jul 17 14:51:52 EST 2012


Hi Wen,

2012/07/17 12:32, Wen Congyang wrote:
> At 07/17/2012 11:08 AM, Yasuaki Ishimatsu Wrote:
>> Hi Wen,
>>
>> 2012/07/17 11:32, Wen Congyang wrote:
>>> At 07/17/2012 09:54 AM, Yasuaki Ishimatsu Wrote:
>>>> Hi Wen,
>>>>
>>>> 2012/07/17 10:44, Yasuaki Ishimatsu wrote:
>>>>> Hi Wen,
>>>>>
>>>>> 2012/07/13 12:35, Wen Congyang wrote:
>>>>>> At 07/09/2012 06:24 PM, Yasuaki Ishimatsu Wrote:
>>>>>>> acpi_memory_device_remove() has been prepared to remove physical memory.
>>>>>>> But, the function only frees acpi_memory_device currentlry.
>>>>>>>
>>>>>>> The patch adds following functions into acpi_memory_device_remove():
>>>>>>>       - offline memory
>>>>>>>       - remove physical memory (only return -EBUSY)
>>>>>>>       - free acpi_memory_device
>>>>>>>
>>>>>>> CC: David Rientjes <rientjes at google.com>
>>>>>>> CC: Jiang Liu <liuj97 at gmail.com>
>>>>>>> CC: Len Brown <len.brown at intel.com>
>>>>>>> CC: Benjamin Herrenschmidt <benh at kernel.crashing.org>
>>>>>>> CC: Paul Mackerras <paulus at samba.org>
>>>>>>> CC: Christoph Lameter <cl at linux.com>
>>>>>>> Cc: Minchan Kim <minchan.kim at gmail.com>
>>>>>>> CC: Andrew Morton <akpm at linux-foundation.org>
>>>>>>> CC: KOSAKI Motohiro <kosaki.motohiro at jp.fujitsu.com>
>>>>>>> CC: Wen Congyang <wency at cn.fujitsu.com>
>>>>>>> Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki at jp.fujitsu.com>
>>>>>>>
>>>>>>> ---
>>>>>>>      drivers/acpi/acpi_memhotplug.c |   26 +++++++++++++++++++++++++-
>>>>>>>      drivers/base/memory.c          |   39 +++++++++++++++++++++++++++++++++++++++
>>>>>>>      include/linux/memory.h         |    5 +++++
>>>>>>>      include/linux/memory_hotplug.h |    1 +
>>>>>>>      mm/memory_hotplug.c            |    8 ++++++++
>>>>>>>      5 files changed, 78 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> Index: linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:29.946888653 +0900
>>>>>>> +++ linux-3.5-rc6/drivers/acpi/acpi_memhotplug.c	2012-07-09 18:08:43.470719531 +0900
>>>>>>> @@ -29,6 +29,7 @@
>>>>>>>      #include <linux/module.h>
>>>>>>>      #include <linux/init.h>
>>>>>>>      #include <linux/types.h>
>>>>>>> +#include <linux/memory.h>
>>>>>>>      #include <linux/memory_hotplug.h>
>>>>>>>      #include <linux/slab.h>
>>>>>>>      #include <acpi/acpi_drivers.h>
>>>>>>> @@ -452,12 +453,35 @@ static int acpi_memory_device_add(struct
>>>>>>>      static int acpi_memory_device_remove(struct acpi_device *device, int type)
>>>>>>>      {
>>>>>>>      	struct acpi_memory_device *mem_device = NULL;
>>>>>>> -
>>>>>>> +	struct acpi_memory_info *info, *tmp;
>>>>>>> +	int result;
>>>>>>> +	int node;
>>>>>>>
>>>>>>>      	if (!device || !acpi_driver_data(device))
>>>>>>>      		return -EINVAL;
>>>>>>>
>>>>>>>      	mem_device = acpi_driver_data(device);
>>>>>>> +
>>>>>>> +	node = acpi_get_node(mem_device->device->handle);
>>>>>>> +
>>>>>>> +	list_for_each_entry_safe(info, tmp, &mem_device->res_list, list) {
>>>>>>> +		if (!info->enabled)
>>>>>>> +			continue;
>>>>>>> +
>>>>>>> +		if (!is_memblk_offline(info->start_addr, info->length)) {
>>>>>>> +			result = offline_memory(info->start_addr, info->length);
>>>>>>> +			if (result)
>>>>>>> +				return result;
>>>>>>> +		}
>>>>>>> +
>>>>>>> +		result = remove_memory(node, info->start_addr, info->length);
>>>>>>
>>>>>> The user may online the memory between offline_memory() and remove_memory().
>>>>>> So I think we should lock memory hotplug before check the memory's status
>>>>>> and release it after remove_memory().
>>>>>
>>>>> How about get "mem_block->state_mutex" of removed memory? When offlining
>>>>> memory, we need to change "memory_block->state" into "MEM_OFFLINE".
>>>>> In this case, we get mem_block->state_mutex. So I think the mutex lock
>>>>> is beneficial.
>>>>
>>>> It is not good idea since remove_memory frees mem_block structure...
>>>> Do you have any ideas?
>>>
>>> Hmm, split offline_memory() to 2 functions: offline_pages() and __offline_pages()
>>>
>>> offline_pages()
>>> 	lock_memory_hotplug();
>>> 	__offline_pages();
>>> 	unlock_memory_hotplug();
>>>
>>> and implement remove_memory() like this:
>>> remove_memory()
>>> 	lock_memory_hotplug()
>>> 	if (!is_memblk_offline()) {
>>> 		__offline_pages();
>>> 	}
>>> 	// cleanup
>>> 	unlock_memory_hotplug();
>>>
>>> What about this?
>>
>> I also thought about it once. But a problem remains. Current offilne_pages()
>> cannot realize the memory has been removed by remove_memory(). So even if
>> protecting the race by lock_memory_hotplug(), offline_pages() can offline
>> the removed memory. offline_pages() should have the means to know the memory
>> was removed. But I don't have good idea.
> 
> We can not online/offline part of memory block, so what about this?

It seems you do not understand my concern.
When memory_remove() and offline_pages() run to same memory simultaneously,
offline_pages runs to removed memory.

memory_remove()              | offline_pages()
-----------------------------------------------------------
lock_memory_hotplug()        |
                             | wait at lock_memory_hotplug()
remove memory                |
unlock_memory_hotplug()      |
                             | wake up and start offline_pages()
			     | offline page
			     | => but the memory has already removed
			     |    by memory_remove()

In this case, offline_page() may access removed memory.

Thanks,
Yasuaki Ishimatsu

> 
> remove_memory()
> 	lock_memory_hotplug()
> 	for each memory block:
> 		if (!is_memblk_offline()) {
> 			__offline_pages();
> 		}
> 	// cleanup
> 	unlock_memory_hotplug();
> 
> Thanks
> Wen Congyang
>>
>> Thanks,
>> Yasuaki Ishimatsu
>>
>>>
>>> Thanks
>>> Wen Congyang
>>>>
>>>> Thanks,
>>>> Yasuaki Ishimatsu
>>>>
>>>>> Thanks,
>>>>> Yasuaki Ishimatsu
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Wen Congyang
>>>>>>
>>>>>>> +		if (result)
>>>>>>> +			return result;
>>>>>>> +
>>>>>>> +		list_del(&info->list);
>>>>>>> +		kfree(info);
>>>>>>> +	}
>>>>>>> +
>>>>>>>      	kfree(mem_device);
>>>>>>>
>>>>>>>      	return 0;
>>>>>>> Index: linux-3.5-rc6/include/linux/memory_hotplug.h
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/include/linux/memory_hotplug.h	2012-07-09 18:08:29.955888542 +0900
>>>>>>> +++ linux-3.5-rc6/include/linux/memory_hotplug.h	2012-07-09 18:08:43.471719518 +0900
>>>>>>> @@ -233,6 +233,7 @@ static inline int is_mem_section_removab
>>>>>>>      extern int mem_online_node(int nid);
>>>>>>>      extern int add_memory(int nid, u64 start, u64 size);
>>>>>>>      extern int arch_add_memory(int nid, u64 start, u64 size);
>>>>>>> +extern int remove_memory(int nid, u64 start, u64 size);
>>>>>>>      extern int offline_memory(u64 start, u64 size);
>>>>>>>      extern int sparse_add_one_section(struct zone *zone, unsigned long start_pfn,
>>>>>>>      								int nr_pages);
>>>>>>> Index: linux-3.5-rc6/mm/memory_hotplug.c
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/mm/memory_hotplug.c	2012-07-09 18:08:29.953888567 +0900
>>>>>>> +++ linux-3.5-rc6/mm/memory_hotplug.c	2012-07-09 18:08:43.476719455 +0900
>>>>>>> @@ -659,6 +659,14 @@ out:
>>>>>>>      }
>>>>>>>      EXPORT_SYMBOL_GPL(add_memory);
>>>>>>>
>>>>>>> +int remove_memory(int nid, u64 start, u64 size)
>>>>>>> +{
>>>>>>> +	return -EBUSY;
>>>>>>> +
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL_GPL(remove_memory);
>>>>>>> +
>>>>>>> +
>>>>>>>      #ifdef CONFIG_MEMORY_HOTREMOVE
>>>>>>>      /*
>>>>>>>       * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
>>>>>>> Index: linux-3.5-rc6/drivers/base/memory.c
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/drivers/base/memory.c	2012-07-09 18:08:29.947888640 +0900
>>>>>>> +++ linux-3.5-rc6/drivers/base/memory.c	2012-07-09 18:10:54.880076739 +0900
>>>>>>> @@ -70,6 +70,45 @@ void unregister_memory_isolate_notifier(
>>>>>>>      }
>>>>>>>      EXPORT_SYMBOL(unregister_memory_isolate_notifier);
>>>>>>>
>>>>>>> +bool is_memblk_offline(unsigned long start, unsigned long size)
>>>>>>> +{
>>>>>>> +	struct memory_block *mem = NULL;
>>>>>>> +	struct mem_section *section;
>>>>>>> +	unsigned long start_pfn, end_pfn;
>>>>>>> +	unsigned long pfn, section_nr;
>>>>>>> +
>>>>>>> +	start_pfn = PFN_DOWN(start);
>>>>>>> +	end_pfn = start_pfn + PFN_DOWN(start);
>>>>>>> +
>>>>>>> +	for (pfn = start_pfn; pfn < end_pfn; pfn += PAGES_PER_SECTION) {
>>>>>>> +		section_nr = pfn_to_section_nr(pfn);
>>>>>>> +		if (!present_section_nr(section_nr));
>>>>>>> +			continue;
>>>>>>> +
>>>>>>> +		section = __nr_to_section(section_nr);
>>>>>>> +		/* same memblock? */
>>>>>>> +		if (mem)
>>>>>>> +			if((section_nr >= mem->start_section_nr) &&
>>>>>>> +			   (section_nr <= mem->end_section_nr))
>>>>>>> +				continue;
>>>>>>> +
>>>>>>> +		mem = find_memory_block_hinted(section, mem);
>>>>>>> +		if (!mem)
>>>>>>> +			continue;
>>>>>>> +		if (mem->state == MEM_OFFLINE)
>>>>>>> +			continue;
>>>>>>> +
>>>>>>> +		kobject_put(&mem->dev.kobj);
>>>>>>> +		return false;
>>>>>>> +	}
>>>>>>> +
>>>>>>> +	if (mem)
>>>>>>> +		kobject_put(&mem->dev.kobj);
>>>>>>> +
>>>>>>> +	return true;
>>>>>>> +}
>>>>>>> +EXPORT_SYMBOL(is_memblk_offline);
>>>>>>> +
>>>>>>>      /*
>>>>>>>       * register_memory - Setup a sysfs device for a memory block
>>>>>>>       */
>>>>>>> Index: linux-3.5-rc6/include/linux/memory.h
>>>>>>> ===================================================================
>>>>>>> --- linux-3.5-rc6.orig/include/linux/memory.h	2012-07-08 09:23:56.000000000 +0900
>>>>>>> +++ linux-3.5-rc6/include/linux/memory.h	2012-07-09 18:08:43.484719355 +0900
>>>>>>> @@ -106,6 +106,10 @@ static inline int memory_isolate_notify(
>>>>>>>      {
>>>>>>>      	return 0;
>>>>>>>      }
>>>>>>> +static inline bool is_memblk_offline(unsigned long start, unsigned long size)
>>>>>>> +{
>>>>>>> +	return false;
>>>>>>> +}
>>>>>>>      #else
>>>>>>>      extern int register_memory_notifier(struct notifier_block *nb);
>>>>>>>      extern void unregister_memory_notifier(struct notifier_block *nb);
>>>>>>> @@ -120,6 +124,7 @@ extern int memory_isolate_notify(unsigne
>>>>>>>      extern struct memory_block *find_memory_block_hinted(struct mem_section *,
>>>>>>>      							struct memory_block *);
>>>>>>>      extern struct memory_block *find_memory_block(struct mem_section *);
>>>>>>> +extern bool is_memblk_offline(unsigned long start, unsigned long size);
>>>>>>>      #define CONFIG_MEM_BLOCK_SIZE	(PAGES_PER_SECTION<<PAGE_SHIFT)
>>>>>>>      enum mem_add_context { BOOT, HOTPLUG };
>>>>>>>      #endif /* CONFIG_MEMORY_HOTPLUG_SPARSE */
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>>>>> the body to majordomo at kvack.org.  For more info on Linux MM,
>>>>>> see: http://www.linux-mm.org/ .
>>>>>> Don't email: <a href=mailto:"dont at kvack.org"> email at kvack.org </a>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>>>>> the body of a message to majordomo at vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
>>>> the body of a message to majordomo at vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>
>>
>>
>>
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo at kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont at kvack.org"> email at kvack.org </a>
> 





More information about the Linuxppc-dev mailing list