[PATCH] slub: Don't throw away partial remote slabs if there is no local memory
Wanpeng Li
liwanp at linux.vnet.ibm.com
Tue Jan 7 20:21:45 EST 2014
On Tue, Jan 07, 2014 at 06:10:16PM +0900, Joonsoo Kim wrote:
>On Tue, Jan 07, 2014 at 04:48:40PM +0800, Wanpeng Li wrote:
>> Hi Joonsoo,
>> On Tue, Jan 07, 2014 at 04:41:36PM +0900, Joonsoo Kim wrote:
>> >On Tue, Jan 07, 2014 at 01:21:00PM +1100, Anton Blanchard wrote:
>> >>
>> [...]
>> >Hello,
>> >
>> >I think that we need more efforts to solve unbalanced node problem.
>> >
>> >With this patch, even if the node of the current cpu slab is not favorable
>> >for the unbalanced node, allocation would proceed and we would get
>> >unintended memory.
>> >
>>
>> We have a machine:
>>
>> [ 0.000000] Node 0 Memory:
>> [ 0.000000] Node 4 Memory: 0x0-0x10000000 0x20000000-0x60000000 0x80000000-0xc0000000
>> [ 0.000000] Node 6 Memory: 0x10000000-0x20000000 0x60000000-0x80000000
>> [ 0.000000] Node 10 Memory: 0xc0000000-0x180000000
>>
>> [ 0.041486] Node 0 CPUs: 0-19
>> [ 0.041490] Node 4 CPUs:
>> [ 0.041492] Node 6 CPUs:
>> [ 0.041495] Node 10 CPUs:
>>
>> The pages of the current cpu slab should be allocated from the fallback
>> zones/nodes of the memoryless node by the buddy system, so how can an
>> unfavorable node happen?
>
>Hi, Wanpeng.
>
>IIRC, if we call kmem_cache_alloc_node() with a certain node #, we try to
>allocate the page from the fallback zones/nodes of that node #. So that
>fallback list isn't related to the fallback list of the memoryless node #.
>Am I wrong?
>
Anton added a node_spanned_pages(node) check, so the current cpu slab
mentioned above is allocated against a memoryless node. Am I missing
something?
Regards,
Wanpeng Li
>Thanks.
>
>>
>> >And there is one more problem. Even if we have some partial slabs on a
>> >compatible node, we would allocate a new slab, because get_partial()
>> >cannot handle this unbalanced-node case.
>> >
>> >To fix this correctly, how about following patch?
>> >
>>
>> So I think we should fold both of your patches into one.
>>
>> Regards,
>> Wanpeng Li
>>
>> >Thanks.
>> >
>> >------------->8--------------------
>> >diff --git a/mm/slub.c b/mm/slub.c
>> >index c3eb3d3..a1f6dfa 100644
>> >--- a/mm/slub.c
>> >+++ b/mm/slub.c
>> >@@ -1672,7 +1672,19 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
>> > {
>> > void *object;
>> > int searchnode = (node == NUMA_NO_NODE) ? numa_node_id() : node;
>> >+ struct zonelist *zonelist;
>> >+ struct zoneref *z;
>> >+ struct zone *zone;
>> >+ enum zone_type high_zoneidx = gfp_zone(flags);
>> >
>> >+ if (!node_present_pages(searchnode)) {
>> >+ zonelist = node_zonelist(searchnode, flags);
>> >+ for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
>> >+ searchnode = zone_to_nid(zone);
>> >+ if (node_present_pages(searchnode))
>> >+ break;
>> >+ }
>> >+ }
>> > object = get_partial_node(s, get_node(s, searchnode), c, flags);
>> > if (object || node != NUMA_NO_NODE)
>> > return object;
>> >
>> >--
>> >To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> >the body to majordomo at kvack.org. For more info on Linux MM,
>> >see: http://www.linux-mm.org/ .
>> >Don't email: email at kvack.org
More information about the Linuxppc-dev mailing list