[PATCH 2.6.14] mm: 8xx MM fix for

Joakim Tjernlund joakim.tjernlund at transmode.se
Tue Nov 8 05:37:45 EST 2005


 > 
> On Mon, Nov 07, 2005 at 07:14:15PM +0100, Joakim Tjernlund wrote:
> > > -----Original Message-----
> > > From: Tom Rini [mailto:trini at kernel.crashing.org] 
> > > Sent: 07 November 2005 16:52
> > > To: Marcelo Tosatti
> > > Cc: Joakim Tjernlund; Pantelis Antoniou; Dan Malek; 
> > > linuxppc-embedded at ozlabs.org; gtolstolytkin at ru.mvista.com
> > > Subject: Re: [PATCH 2.6.14] mm: 8xx MM fix for
> > > 
> > > On Mon, Nov 07, 2005 at 08:16:18AM -0200, Marcelo Tosatti wrote:
> > > > Joakim!
> > > > 
> > > > > On Mon, Nov 07, 2005 at 03:32:52PM +0100, Joakim Tjernlund wrote:
> > > > > Hi Marcelo
> > > > > 
> > > > > [SNIP] 
> > > > > > > The root of the problem is the changes to the 8xx TLB
> > > > > > > handlers introduced during v2.6. What happens is that the
> > > > > > > TLBMiss handlers load the zeroed pte into the TLB, causing
> > > > > > > the TLBError handler to be invoked (that's two TLB faults
> > > > > > > per pagefault), which then jumps to the generic MM code to
> > > > > > > set up the pte.
> > > > > > > 
> > > > > > > The bug is that the zeroed TLB entry is not invalidated (the
> > > > > > > same reason for the "dcbst" misbehaviour), resulting in
> > > > > > > infinite TLBError faults.
> > > > > > 
> > > > > > Dan, I wonder why we just don't go back to v2.4 behaviour.
> > > > > 
> > > > > This is one reason why it is the way it is:
> > > > > 
> > > > > > http://ozlabs.org/pipermail/linuxppc-embedded/2005-January/016382.html
> > > > > > The details are a little fuzzy ATM, but I think the reason for
> > > > > > the current impl. was only that it was less intrusive to impl.
> > > > 
> > > > > Ah, I see. I wonder if the bug is processor specific: we don't
> > > > > have such changes in our v2.4 tree and never experienced such a
> > > > > problem.
> > > > > 
> > > > > It should be pretty easy to hit it, right? (instruction pagefaults
> > > > > should fail).
> > > > > 
> > > > > Grigori, Tom, can you enlighten us about the issue at the URL
> > > > > above? How can it be triggered?
> > > 
> > > So after looking at the code in 2.6.14 and current git, I think the
> > > above URL isn't relevant, unless there was a change I missed (which
> > > could totally be possible) that reverted the patch there and fixed
> > > that issue in a different manner.  But since I didn't figure that out
> > > until I had finished researching it again:
> > 
> > I wasn't clear enough. What I meant was that the above patch made me
> > think, and the result was that I came up with a simpler fix, the "two
> > exception" fix that is in current kernels. See
> > http://linux.bkbits.net:8080/linux-2.6/diffs/arch/ppc/kernel/head_8xx.S@1.19?nav=index.html|src/.|src/arch|src/arch/ppc|src/arch/ppc/kernel|hist/arch/ppc/kernel/head_8xx.S
> > It appears this fix has some other issues :(
> > 
> > How do the other ppc arches do it? I am guessing that they don't double
> > fault, but bail out to do_page_fault from the TLB Miss handler, like
> > 8xx used to do.
> 
> Assuming Dan doesn't come up with a more simple & better fix, maybe we
> should go back to the original patch I made?

That was what I was thinking too (or some variation of your patch).
I wonder if that would also solve the misbehaving dcbst problem Marcelo
found some time ago?

 Jocke


