[PATCH v6 05/10] mm/memory_hotplug: Shrink zones when offlining memory

David Hildenbrand david at redhat.com
Thu Dec 19 04:08:04 AEDT 2019


On 01.12.19 00:21, Andrew Morton wrote:
> On Sun, 27 Oct 2019 23:45:52 +0100 David Hildenbrand <david at redhat.com> wrote:
> 
>> I think I just found an issue with try_offline_node(). 
>> try_offline_node() is pretty much broken already (touches garbage 
>> memmaps and will not considers mixed NIDs within sections), however, 
>> relies on the node span to look for memory sections to probe. So it 
>> seems to rely on the nodes getting shrunk when removing memory, not when 
>> offlining.
>>
>> As we shrink the node span when offlining now and not when removing, 
>> this can go wrong once we offline the last memory block of the node and 
>> offline the last CPU. We could still have memory around that we could 
>> re-online, however, the node would already be offline. Unlikely, but 
>> possible.
>>
>> Note that the same is also broken without this patch in case memory is 
>> never onlined. The "pfn_to_nid(pfn) != nid" can easily succeed on the 
>> garbage memmap, resulting in  no memory being detected as belonging to 
>> the node. Also, resize_pgdat_range() is called when onlining memory, not 
>> when adding it. :/ Oh this is so broken :)
>>
>> The right fix is probably to walk over all memory blocks that could 
>> exist and test if they belong to the nid (if offline, check the 
>> block->nid, if online check all pageblocks). A fix we can then move in 
>> front of this patch.
>>
>> Will look into this this week.
> 
> And this series shows almost no sign of having been reviewed.  I'll hold
> it over for 5.6.
> 

Hi Andrew, any chance we can get the (now at least reviewed - thx Oscar)
fix in patch #5 into 5.5? (I want to do the final stable backports for
the uninitialized memmap stuff)

-- 
Thanks,

David / dhildenb



More information about the Linuxppc-dev mailing list