[PATCH] mm: numa: bugfix for LAST_CPUPID_NOT_IN_PAGE_FLAGS
liu ping fan
qemulist at gmail.com
Fri Feb 28 17:36:18 EST 2014
On Fri, Feb 28, 2014 at 12:47 PM, Aneesh Kumar K.V
<aneesh.kumar at linux.vnet.ibm.com> wrote:
> Andrew Morton <akpm at linux-foundation.org> writes:
>
>> On Wed, 26 Feb 2014 13:22:16 +0530 "Aneesh Kumar K.V" <aneesh.kumar at linux.vnet.ibm.com> wrote:
>>
>>> Andrew Morton <akpm at linux-foundation.org> writes:
>>>
>>> > On Wed, 5 Feb 2014 09:25:46 +0800 Liu Ping Fan <qemulist at gmail.com> wrote:
>>> >
>>> >> When doing some numa tests on powerpc, I triggered an oops. I found
>>> >> it is caused by using page->_last_cpupid. It should be initialized to
>>> >> "-1 & LAST_CPUPID_MASK", not "-1". Otherwise, in task_numa_fault(),
>>> >> the check (last_cpupid == (-1 & LAST_CPUPID_MASK)) never matches,
>>> >> which finally causes an oops in task_numa_group(), since there are
>>> >> fewer online CPUs than possible CPUs.
>>> >
>>> > I grabbed this. I added this to the changelog:
>>> >
>>> > : PPC needs the LAST_CPUPID_NOT_IN_PAGE_FLAGS case because ppc needs to
>>> > : support a large physical address region, up to 2^46, but with a small
>>> > : section size (2^24). So as NR_CPUS grows, it is easy to end up in the
>>> > : not-in-page-flags case.
>>> >
>>> > to hopefully address Peter's observation.
>>> >
>>> > How should we proceed with this? I'm getting the impression that numa
>>> > balancing on ppc is a dead duck in 3.14, so perhaps this and
>>> >
>>> > powerpc-mm-add-new-set-flag-argument-to-pte-pmd-update-function.patch
>>> > mm-dirty-accountable-change-only-apply-to-non-prot-numa-case.patch
>>> > mm-use-ptep-pmdp_set_numa-for-updating-_page_numa-bit.patch
>>> >
>>>
>>> All these are already in 3.14 ?
>>
>> Yes.
>>
>>> > are 3.15-rc1 material?
>>> >
>>>
>>> We should push the first hunk to 3.14. I will wait for Liu to redo the
>>> patch. BTW this should happen only when SPARSE_VMEMMAP is not
>>> specified. Srikar had reported the issue here
>>>
>>> http://mid.gmane.org/20140219180200.GA29257@linux.vnet.ibm.com
>>>
>>> #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
>>> #define SECTIONS_WIDTH SECTIONS_SHIFT
>>> #else
>>> #define SECTIONS_WIDTH 0
>>> #endif
>>>
>>
>> I'm lost. What patch are you talking about? The first hunk of what?
>
> The patch in this thread.
>
>>
>> I assume we're talking about
>> mm-numa-bugfix-for-last_cpupid_not_in_page_flags.patch, which I had
>> queued for 3.14. I'll put it on hold until there's some clarity here.
>
> We don't need the complete patch, it is just the first hunk that we need
> to fix the crash ie. we only need
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a7b4e31..ddc66df4 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -727,7 +727,7 @@ static inline int page_cpupid_last(struct page *page)
>  }
>  static inline void page_cpupid_reset_last(struct page *page)
>  {
> -	page->_last_cpupid = -1;
> +	page->_last_cpupid = -1 & LAST_CPUPID_MASK;
>  }
>  #else
>  static inline int page_cpupid_last(struct page *page)
>
> Also the issue will only happen when SPARSE_VMEMMAP is not enabled. I
> will send a proper patch with updated changelog. I was hoping Liu will
> get to that quickly
>
Thanks for sending V2. Someone else changed the environment on the ppc
machine, so I am blocked on setting it up again to re-test this patch.
That is why I could not send it out quickly.
Best regards,
Fan
>
> -aneesh
>