Bug in reclaim logic with exhausted nodes?

Nishanth Aravamudan nacc at linux.vnet.ibm.com
Tue May 13 04:46:54 EST 2014

Hi Christoph,

Sorry for the delay in my response!

On 03.04.2014 [11:41:37 -0500], Christoph Lameter wrote:
> On Mon, 31 Mar 2014, Nishanth Aravamudan wrote:
> > Yep. The node exists, it's just fully exhausted at boot (due to the
> > presence of 16GB pages reserved at boot-time).
> Well if you want us to support that then I guess you need to propose
> patches to address this issue.

Yep, that's my plan, I was hoping to get input from developers/experts
such as yourself first. Obviously, code speaks louder though...

> > I'd appreciate a bit more guidance? I'm suggesting that in this case
> > the node functionally has no memory. So the page allocator should
> > not allow allocations from it -- except (I need to investigate this
> > still) userspace accessing the 16GB pages on that node, but that, I
> > believe, doesn't go through the page allocator at all, it's all from
> > hugetlb interfaces. It seems to me there is a bug in SLUB that we
> > are noting that we have a useless per-node structure for a given
> > nid, but not actually preventing requests to that node or reclaim
> > because of those allocations.
> Well if you can address that without impacting the fastpath then we
> could do this. Otherwise we would need a fake structure here to avoid
> adding checks to the fastpath

Ok, I'll keep thinking about what makes the most sense.

> > I think there is a logical bug (even if it only occurs in this
> > particular corner case) where if reclaim progresses for a THISNODE
> > allocation, we don't check *where* the reclaim is progressing, and thus
> > may falsely be indicating that we have done some progress when in fact
> > the allocation that is causing reclaim will not possibly make any more
> > progress.
> Ok maybe we could address this corner case. How would you do this?

This is where I started to get stumped. It seems like did_some_progress
is only checking that any progress is made. It would be more expensive
in the reclaim path to check what nodes we made progress on and verify
it was on the intended one (if we are reclaiming due to THISNODE). I
will try and look at this case specifically more, I apologize it's
taking me quite a bit of time to get up-to-speed on the code and design.


More information about the Linuxppc-dev mailing list