[PATCH v6 00/15] memory-hotplug: hot-remove physical memory

Thu Jan 10 19:23:07 EST 2013

(2013/01/10 16:55), Glauber Costa wrote:
> On 01/10/2013 11:31 AM, Kamezawa Hiroyuki wrote:
>> (2013/01/10 16:14), Glauber Costa wrote:
>>> On 01/10/2013 06:17 AM, Tang Chen wrote:
>>>>>> Note: if the memory provided by the memory device is used by the
>>>>>> kernel, it
>>>>>> can't be offlined. It is not a bug.
>>>>>
>>>>> Right.  But how often does this happen in testing?  In other words,
>>>>> please provide an overall description of how well memory hot-remove is
>>>>> presently operating.  Is it reliable?  What is the success rate in
>>>>> real-world situations?
>>>>
>>>> We test the hot-remove functionality mostly with movable_online used.
>>>> And the memory used by kernel is not allowed to be removed.
>>>
>>> Can you try doing this using cpusets configured to hardwall ?
>>> It is my understanding that the object allocators will try hard not to
>>> allocate anything outside the walls defined by cpuset. Which means that
>>> if you have one process per node, and they are hardwalled, your kernel
>>> memory will be spread evenly among the machine. With a big enough load,
>>> they should eventually be present in all blocks.
>>>
>>
>> I'm sorry I couldn't catch your point.
>> Do you want to confirm whether cpuset can work enough instead of
>> ZONE_MOVABLE ?
>> Or Do you want to confirm whether ZONE_MOVABLE will not work if it's
>> used with cpuset ?
>>
>>
> No, I am not proposing to use cpuset do tackle the problem. I am just
> wondering if you would still have high success rates with cpusets in use
> with hardwalls. This is just one example of a workload that would spread
> kernel memory around quite heavily.
>
> So this is just me trying to understand the limitations of the mechanism.
>

Hm, okay. In my undestanding, if the whole memory of a node is configured as
MOVABLE, no kernel memory will not be allocated in the node because zonelist
will not match. So, if cpuset is used with hardwalls, user will see -ENOMEM or OOM,
I guess. even fork() will fail if fallback-to-other-node is not allowed.

If it's configure as ZONE_NORMAL, you need to pray for offlining memory.

AFAIK, IBM's ppc? has 16MB section size. So, some of sections can be offlined
even if they are configured as ZONE_NORMAL. For them, placement of offlined
memory is not important because it's virtualized by LPAR, they don't try
to remove DIMM, they just want to increase/decrease amount of memory.
It's an another approach.

But here, we(fujitsu) tries to remove a system board/DIMM.
So, configuring the whole memory of a node as ZONE_MOVABLE and tries to guarantee
DIMM as removable.

>> IMHO, I don't think shrink_slab() can kill all objects in a node even
>> if they are some caches. We need more study for doing that.
>>
>
> Indeed, shrink_slab can only kill cached objects. They, however, are
> usually a very big part of kernel memory. I wonder though if in case of
> failure, it is worth it to try at least one shrink pass before you give up.
>

Yeah, now, his (our) approach is never allowing kernel memory on a node to be
hot-removed by ZONE_MOVABLE. So, shrink_slab()'s effect will not be seen.

If other brave guys tries to use ZONE_NORMAL for hot-pluggable DIMM, I see,
it's worth triying.

How about checking the target memsection is in NORMAL or in MOVABLE at
hot-removing ? If NORMAL, shrink_slab() will be worth to be called.

BTW, shrink_slab() is now node/zone aware ? If not, fixing that first will
be better direction I guess.

Thanks,
-Kame