[PATCH 0/8] 8xx: Misc fixes for buggy insn

Wed Nov 11 06:54:08 EST 2009

Scott Wood <scottwood at freescale.com> wrote on 10/11/2009 00:00:04:
>
> On Mon, Nov 09, 2009 at 03:53:21PM -0600, Scott Wood wrote:
> > On Fri, Nov 06, 2009 at 10:29:44AM +0100, Joakim Tjernlund wrote:
> > > > > With this, the kernel hangs after "Mount-cache hash table entries: 512".
> > > >
> > > > Somewhat surprising result. I didn't expect you would even hit this
> > > > condition now as we haven't enabled the use of dcbX insn yet.
> > > > The only thing I can think of is the you hit the 0x00f0 due to other
> > > > dcbX insn use and the kernel managed to fixup this in the page fault handler
> > > > by pure luck before.
> >
> > It's bizarre -- it still happens even if I revert the branch to FixupDAR.
> > However, if I comment out the final "b 151b", it stops happening.  It also
> > stops happening sometimes depending on where I stick printk()s to debug
> > this.
> >
> > So it may be an unrelated issue that just got perturbed somehow.
>
> OK, figured it out.  The fixup code pushed things around so that in
> syscall_exit_cont, SRR0/SRR1 were being loaded immediately prior to a page
> boundary, with the rfi after the page boundary.  On crossing the boundary,
> we take an ITLB miss (which goes from possibility to certainty with the
> CPU15 workaround), and SRR0/SRR1 get clobbered.

I think I have misunderstood, its is not CPU15 or 8xx problem per se, it
is a general problem that could happen to any ppc CPU, right?
8xx just happen to be the first CPU that hits this case due to my
DAR fixing

>
> I suppose we'll need to find all places where we do rfi with the MMU
> enabled, and ensure acceptable alignment. :-(

That seems the right fix, the asm in entry_32.S needs have some alignment
here and there to make sure you don't cross a page boundary at the wrong time.