3.10-rc ppc64 corrupts usermem when swapping

Hugh Dickins hughd at google.com
Mon Jun 3 04:19:17 EST 2013


On Sun, 2 Jun 2013, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt <benh at au1.ibm.com> writes:
> > On Fri, 2013-05-31 at 14:45 +0530, Aneesh Kumar K.V wrote:
> >
> >> > The patch you are running on is what I'll send to Linus for 3.10 (+/-
> >> > cosmetics). Aneesh's second patch is a much larger rework which will be
> >> > needed for THP but that will wait for 3.11. I'm happy for you to test it
> >> > but I first want to make sure it's solid with the 3.10 fix :-)
> >
> > BTW. One concern I still have is that Hugh identified the bad commit
> > to be:
> >
> > 7e74c3921ad9610c0b49f28b8fc69f7480505841
> > "powerpc: Fix hpte_decode to use the correct decoding for page sizes".
> >
> > However, you introduce the return on HPTE not found earlier, in
> >
> > b1022fbd293564de91596b8775340cf41ad5214c
> > "powerpc: Decode the pte-lp-encoding bits correctly."
> >
> > So while I'm still happy with the current band-aid for 3.10 and am
> > about to send it to Linus, the above *does* seem to indicate that
> > there is also something wrong with the "Fix hpte_decode..." commit,
> > which might not actually get the page size right...
> >
> > Can you investigate ?
> 
> 7e74c3921ad9610c0b49f28b8fc69f7480505841 
> "powerpc: Fix hpte_decode to use the correct decoding for page sizes"
> changes should only impact hpte_decode. We don't change the details
> of hpte_actual_psize at all in this patch. That means we should see a
> difference only with kexec, right?
> 
> Hugh,
> 
> Will you be able to double check whether
> 7e74c3921ad9610c0b49f28b8fc69f7480505841 is the bad commit. The one
> before that is what we changed in the patch that fixed your problem.

You are absolutely right.  I just set b1022fbd29 going, expecting
to answer you tomorrow: but got a Segmentation fault in 20 minutes
(quicker than ever seen before).  It looks as if I was running some
other kernel for the last stage of my bisection: I can't see how that
came about, but it's not very interesting now - you got it right.

Prior to trying that, I had been running your second patch, 9f70fd8cfe,
and that tested out successfully for 50 hours before I stopped it.

Hugh

