[RFC PATCH] memory-hotplug: Use dev_online for memhp_auto_offline

Sat Feb 25 03:09:13 AEDT 2017

Michal Hocko <mhocko at kernel.org> writes:

> On Fri 24-02-17 16:05:18, Vitaly Kuznetsov wrote:
>> Michal Hocko <mhocko at kernel.org> writes:
>> 
>> > On Fri 24-02-17 15:10:29, Vitaly Kuznetsov wrote:
> [...]
>> >> Just did a quick (and probably dirty) test, increasing guest memory from
>> >> 4G to 8G (32 x 128mb blocks) require 68Mb of memory, so it's roughly 2Mb
>> >> per block. It's really easy to trigger OOM for small guests.
>> >
>> > So we need ~1.5% of the added memory. That doesn't sound like something
>> > to trigger OOM killer too easily. Assuming that increase is not way too
>> > large. Going from 256M (your earlier example) to 8G looks will eat half
>> > the memory which is still quite far away from the OOM.
>> 
>> And if the kernel itself takes 128Mb of ram (which is not something
>> extraordinary with many CPUs) we have zero left. Go to something bigger
>> than 8G and you die.
>
> Again, if you have 128M and jump to 8G then your memory balancing is
> most probably broken.
>

I don't understand what balancing you're talking about. I have a small
guest and I want to add more memory to it and the result is ... OOM. Not
something I expected.

>> > I would call such
>> > an increase a bad memory balancing, though, to be honest. A more
>> > reasonable memory balancing would go and double the available memory
>> > IMHO. Anway, I still think that hotplug is a terrible way to do memory
>> > ballooning.
>> 
>> That's what we have in *all* modern hypervisors. And I don't see why
>> it's bad.
>
> Go and re-read the original thread. Dave has given many good arguments.
>

Are we discussing taking away the memory hotplug feature from all
hypervisors here?

>> > Just make them all online the memory explicitly. I really do not see why
>> > this should be decided by poor user. Put it differently, when should I
>> > disable auto online when using hyperV or other of the mentioned
>> > technologies? CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE should simply die and
>> > I would even be for killing the whole memhp_auto_online thing along the
>> > way. This simply doesn't make any sense to me.
>> 
>> ACPI, for example, is shared between KVM/Qemu, Vmware and real
>> hardware. I can understand why bare metall guys might not want to have
>> auto-online by default (though, major linux distros ship the stupid
>> 'offline' -> 'online' udev rule and nobody complains) -- they're doing
>> some physical action - going to a server room, openning the box,
>> plugging in memory, going back to their place but with VMs it's not like
>> that. What's gonna be the default for ACPI then?
>> 
>> I don't understand why CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE is
>
> Because this is something a user has to think about and doesn't have a
> reasonable way to decide. Our config space is also way too large!

Config space is for distros, not users.

>
>> disturbing and why do we need to take this choice away from distros. I
>> don't understand what we're gaining by replacing it with
>> per-memory-add-technology defaults.
>
> Because those technologies know that they want to have the memory online
> as soon as possible. Jeez, just look at the hv code. It waits for the
> userspace to online memory before going further. Why would it ever want
> to have the tunable in "offline" state? This just doesn't make any
> sense. Look at how things get simplified if we get rid of this clutter

While this will most probably work for me I still disagree with the
concept of 'one size fits all' here and the default 'false' for ACPI,
we're taking away the feature from KVM/Vmware folks so they'll again
come up with the udev rule which has known issues.

[snip].

-- 
  Vitaly