[PATCH v6 05/10] mm/memory_hotplug: Shrink zones when offlining memory

David Hildenbrand david at redhat.com
Sun Dec 1 10:43:31 AEDT 2019



> On 01.12.2019 at 00:22, Andrew Morton <akpm at linux-foundation.org> wrote:
> 
> On Sun, 27 Oct 2019 23:45:52 +0100 David Hildenbrand <david at redhat.com> wrote:
> 
>> I think I just found an issue with try_offline_node(). 
>> try_offline_node() is pretty much broken already (touches garbage 
>> memmaps and will not consider mixed NIDs within sections), however, 
>> relies on the node span to look for memory sections to probe. So it 
>> seems to rely on the nodes getting shrunk when removing memory, not when 
>> offlining.
>> 
>> As we shrink the node span when offlining now and not when removing, 
>> this can go wrong once we offline the last memory block of the node and 
>> offline the last CPU. We could still have memory around that we could 
>> re-online, however, the node would already be offline. Unlikely, but 
>> possible.
>> 
>> Note that the same is also broken without this patch in case memory is 
>> never onlined. The "pfn_to_nid(pfn) != nid" can easily succeed on the 
>> garbage memmap, resulting in no memory being detected as belonging to 
>> the node. Also, resize_pgdat_range() is called when onlining memory, not 
>> when adding it. :/ Oh this is so broken :)
>> 
>> The right fix is probably to walk over all memory blocks that could 
>> exist and test if they belong to the nid (if offline, check the 
>> block->nid, if online check all pageblocks). A fix we can then move in 
>> front of this patch.
>> 
>> Will look into this this week.
> 
> And this series shows almost no sign of having been reviewed.  I'll hold
> it over for 5.6.
> 

Makes sense, can't do anything about it. Btw, this one is the last stable patch to fix access of uninitialized memmaps that is not upstream yet... so it has to remain broken for some time longer.
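
For reference, the check proposed in the quoted mail (walk all memory blocks; if a block is offline, trust block->nid; if it is online, check its pageblocks) could look roughly like the userspace sketch below. This is only an illustration under simplifying assumptions, not the actual fix: struct mem_block_model, PAGEBLOCKS_PER_BLOCK, block_belongs_to_nid() and node_still_has_memory() are hypothetical names and do not exist in the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical value, chosen only to keep the illustration small. */
#define PAGEBLOCKS_PER_BLOCK 4

/*
 * Hypothetical, simplified model of a memory block.
 * This is NOT the kernel's struct memory_block.
 */
struct mem_block_model {
	bool present;	/* does this block exist at all? */
	bool online;	/* memmap is only trustworthy when online */
	int nid;	/* node recorded when the block was added */
	int pageblock_nid[PAGEBLOCKS_PER_BLOCK]; /* valid only when online */
};

/* Does this block contribute memory to node 'nid'? */
static bool block_belongs_to_nid(const struct mem_block_model *blk, int nid)
{
	if (!blk->present)
		return false;

	/* Offline: never look at the (possibly garbage) memmap. */
	if (!blk->online)
		return blk->nid == nid;

	/* Online: check every pageblock, so mixed NIDs are handled. */
	for (size_t i = 0; i < PAGEBLOCKS_PER_BLOCK; i++)
		if (blk->pageblock_nid[i] == nid)
			return true;
	return false;
}

/* Walk all blocks that could exist, independent of the node span. */
static bool node_still_has_memory(const struct mem_block_model *blocks,
				  size_t nr_blocks, int nid)
{
	for (size_t i = 0; i < nr_blocks; i++)
		if (block_belongs_to_nid(&blocks[i], nid))
			return true;
	return false;
}
```

In this model, a node would only be considered for offlining once node_still_has_memory() returns false for it, so an offline block that could still be re-onlined keeps the node around.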


