[PATCH v6 05/10] mm/memory_hotplug: Shrink zones when offlining memory

Andrew Morton akpm at linux-foundation.org
Sun Dec 1 10:21:59 AEDT 2019


On Sun, 27 Oct 2019 23:45:52 +0100 David Hildenbrand <david at redhat.com> wrote:

> I think I just found an issue with try_offline_node().
> try_offline_node() is pretty much broken already (it touches garbage
> memmaps and does not consider mixed NIDs within sections); however, it
> relies on the node span to look for memory sections to probe. So it
> seems to rely on the nodes getting shrunk when removing memory, not when
> offlining.
> 
> As we shrink the node span when offlining now and not when removing,
> this can go wrong once we offline the last memory block of the node and
> offline the last CPU. We could still have memory around that we could
> re-online; however, the node would already be offline. Unlikely, but
> possible.
> 
> Note that the same is also broken without this patch in case memory is
> never onlined. The "pfn_to_nid(pfn) != nid" check can easily succeed on
> the garbage memmap, resulting in no memory being detected as belonging
> to the node. Also, resize_pgdat_range() is called when onlining memory,
> not when adding it. :/ Oh this is so broken :)
> 
> The right fix is probably to walk over all memory blocks that could
> exist and test if they belong to the nid (if offline, check the
> block->nid; if online, check all pageblocks). We can then move that
> fix in front of this patch.
> 
> Will look into this this week.
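
[Illustration of the walk proposed above. This is a user-space sketch, not
kernel code and not the fix that was eventually posted. All names in it
(struct mem_block, block_belongs_to_node(), node_has_memory(),
PAGEBLOCKS_PER_BLOCK) are hypothetical and only model the idea: trust the
recorded block nid for offline blocks, whose memmap may be garbage, and
inspect the pageblocks only for online blocks.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical size, chosen only to keep the example small. */
#define PAGEBLOCKS_PER_BLOCK 4

enum block_state { BLOCK_OFFLINE, BLOCK_ONLINE };

/* Simplified stand-in for a memory block; not the kernel's struct. */
struct mem_block {
	enum block_state state;
	int nid;                                  /* nid recorded for the block */
	int pageblock_nid[PAGEBLOCKS_PER_BLOCK];  /* valid only while online */
};

/* Does this block contribute memory to node 'nid'? */
static bool block_belongs_to_node(const struct mem_block *blk, int nid)
{
	size_t i;

	/* Offline: the per-page data may be garbage, so use block->nid. */
	if (blk->state == BLOCK_OFFLINE)
		return blk->nid == nid;

	/* Online: a block may mix nodes, so check every pageblock. */
	for (i = 0; i < PAGEBLOCKS_PER_BLOCK; i++)
		if (blk->pageblock_nid[i] == nid)
			return true;
	return false;
}

/* Walk all blocks that could exist; true if any belongs to 'nid'. */
static bool node_has_memory(const struct mem_block *blocks, size_t nr, int nid)
{
	size_t i;

	assert(blocks != NULL || nr == 0);
	for (i = 0; i < nr; i++)
		if (block_belongs_to_node(&blocks[i], nid))
			return true;
	return false;
}
```

[In this model a node would only be a candidate for offlining when
node_has_memory() returns false for it, independent of the node span.]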

And this series shows almost no sign of having been reviewed.  I'll hold
it over for 5.6.


