[RFC PATCH] memory-hotplug: Use dev_online for memhp_auto_offline
Vitaly Kuznetsov
vkuznets at redhat.com
Fri Feb 24 05:14:27 AEDT 2017
Michal Hocko <mhocko at kernel.org> writes:
> On Thu 23-02-17 17:36:38, Vitaly Kuznetsov wrote:
>> Michal Hocko <mhocko at kernel.org> writes:
> [...]
>> > Is a grow from 256M -> 128GB really something that happens in real life?
>> > Don't get me wrong but to me this sounds quite exaggerated. Hotmem add
>> > which is an operation which has to allocate memory has to scale with the
>> > currently available memory IMHO.
>>
>> With virtual machines this is very real and not exaggerated at
>> all. E.g. Hyper-V host can be tuned to automatically add new memory when
>> guest is running out of it. Even 100 blocks can represent an issue.
>
> Do you have any reference to a bug report? I am really curious because
> something really smells wrong and it is not clear that the chosen
> solution is really the best one.
Unfortunately I'm not aware of any publicly posted bug reports (CC:
K. Y. - he may have a reference) but I think I still remember everything
correctly. Not sure how deep you want me to go into details though...
Virtual guests under stress were running into OOM easily, and the OOM
killer was even killing the udev process that was trying to online the
memory. A workaround for the issue was added to the Hyper-V driver's
memory-add path:
    hv_mem_hot_add(...) {
        ...
        add_memory(....);
        wait_for_completion_timeout(..., 5*HZ);
        ...
    }
The completion was signalled when the MEM_ONLINE event was observed.
This, of course, slowed things down significantly, and waiting for a
userspace action in the kernel is not a nice thing to have (not to
mention all the other memory-adding methods, which had the same issue).
Simply removing this wait led us back to the same OOM: the hypervisor
kept adding more and more memory, and eventually even add_memory() was
failing, udev and other processes were killed, ...
With the feature in place, new memory is available right after
add_memory() returns, and everything is serialized.
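As a rough illustration, userspace can check whether automatic onlining
is active before relying on it. The sketch below assumes the feature
under discussion is the policy exposed through the sysfs file
/sys/devices/system/memory/auto_online_blocks, which reads back
"online" or "offline". The helper name read_auto_online_policy() is
hypothetical, and the directory is passed as a parameter only so the
helper can be pointed somewhere else for testing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the auto-online policy from <memory_dir>/auto_online_blocks into
 * buf, without the trailing newline.
 * memory_dir is normally /sys/devices/system/memory.
 * Returns 0 on success, -1 if the file cannot be read.
 */
int read_auto_online_policy(const char *memory_dir, char *buf, size_t len)
{
    char path[4096];
    FILE *f;
    int n;

    assert(memory_dir != NULL && buf != NULL && len > 0);

    n = snprintf(path, sizeof(path), "%s/auto_online_blocks", memory_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;              /* path did not fit */

    f = fopen(path, "r");
    if (f == NULL)
        return -1;              /* e.g. a kernel without this file */

    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

A caller would pass "/sys/devices/system/memory" and compare the result
with "online" to decide whether a userspace onlining rule is still
needed.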
> [...]
>> > Because the udev will run a code which can cope with that - retry if the
>> > error is recoverable or simply report with all the details. Compare that
>> > to crawling the system log to see that something has broken...
>>
>> I don't know much about udev, but the most common rule to online memory
>> I've met is:
>>
>> SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
>>
>> doesn't do anything smart.
>
> So what? Is there anything that prevents doing something smarter?
Yes, the asynchronous nature of all this stuff. There is no way you can
stop other blocks from being added to the system while you're processing
something in userspace.
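
For illustration only, here is a minimal sketch of what a "smarter"
userspace onlining step could look like: it checks the block's state,
retries a bounded number of times, and reports the error instead of
failing silently. The function name online_block() and the choice of
EBUSY/EAGAIN as retryable errors are assumptions made for this example,
not something taken from udev or the kernel documentation. Note that it
does nothing about the race described above: more blocks can still be
added while this code is running.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Try to online one memory block through its sysfs "state" file,
 * e.g. /sys/devices/system/memory/memoryN/state.
 * Returns 1 if this call onlined the block, 0 if the block was not
 * offline, or a negative errno value if reading or writing failed.
 * Retrying on EBUSY/EAGAIN is an assumption about which errors are
 * worth retrying.
 */
int online_block(const char *state_path, int max_tries)
{
    char state[32];
    FILE *f;
    int attempt, err = EIO;

    assert(state_path != NULL && max_tries > 0);

    f = fopen(state_path, "r");
    if (f == NULL)
        return -errno;
    if (fgets(state, sizeof(state), f) == NULL) {
        fclose(f);
        return -EIO;
    }
    fclose(f);
    state[strcspn(state, "\n")] = '\0';

    if (strcmp(state, "offline") != 0)
        return 0;               /* nothing to do for this block */

    for (attempt = 0; attempt < max_tries; attempt++) {
        f = fopen(state_path, "w");
        if (f == NULL) {
            err = errno;
            break;
        }
        fputs("online", f);
        /* stdio buffers the write, so a failure is reported by fclose() */
        if (fclose(f) == 0)
            return 1;
        err = errno;
        if (err != EBUSY && err != EAGAIN)
            break;              /* treat anything else as permanent */
    }

    fprintf(stderr, "%s: onlining failed: %s\n", state_path, strerror(err));
    return -err;
}
```

Compared with the one-line udev rule quoted above, this at least
distinguishes "already online" from "failed" and leaves a message with
the errno text, but it still runs asynchronously with respect to
add_memory().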
--
Vitaly