[PATCH RFC v1] mm: is_mem_section_removable() overhaul

David Hildenbrand david at redhat.com
Wed Jan 22 21:39:08 AEDT 2020


>>> Really, the interface is flawed and should have never been merged in the
>>> first place. We cannot simply remove it altogether I am afraid so let's
>>> at least remove the bogus code and pretend that the world is a better
>>> place where everything is removable except the reality sucks...
>>
>> As I expressed already, the interface works as designed/documented and
>> has been used like that for years.
> 
> It seems we do differ in the usefulness though. Using a crappy interface
> for years doesn't make it less crappy. I do realize we cannot remove the
> interface but we can remove issues with the implementation and I dare to
> say that most existing users wouldn't really notice.

Well, at least powerpc-utils (why this interface was introduced) will
notice a) performance wise and b) because more logging output will be
generated (obviously non-offlineable blocks will be tried to offline).

However, it should not break, because we could have had races
before/false positives.

> 
>> I tend to agree that it never should have been merged like that.
>>
>> We have (at least) two places that are racy (with concurrent memory
>> hotplug):
>>
>> 1. /sys/.../memoryX/removable
>> - a) make it always return yes and make the interface useless
>> - b) add proper locking and keep it running as is (e.g., so David can
>>      identify offlineable memory blocks :) ).
>>
>> 2. /sys/.../memoryX/valid_zones
>> - a) always return "none" if the memory is online
>> - b) add proper locking and keep it running as is
>> - c) cache the result ("zone") when a block is onlined (e.g., in
>> mem->zone. If it is NULL, either mixed zones or unknown)
>>
>> At least 2. already scream for a proper device_lock() locking as the
>> mem->state is not stable across the function call.
>>
>> 1a and 2a are the easiest solutions but remove all ways to identify if a
>> memory block could theoretically be offlined - without trying
>> (especially, also to identify the MOVABLE zone).
>>
>> I tend to prefer 1b) and 2c), paired with proper device_lock() locking.
>> We don't affect existing use cases but are able to simplify the code +
>> fix the races.
>>
>> What's your opinion? Any alternatives?
> 
> 1a) and 2c) if you ask me.
> 

I'll look into that all, just might take a little (busy with a lot of
stuff). But after all, it does not seem to be urgent.

1a) will be easy, I'll post a patch soon that we can let rest in -next
for a bit to see if people start to scream out loud.

-- 
Thanks,

David / dhildenb



More information about the Linuxppc-dev mailing list