Critical Interrupt Input

Henry Bausley hbausley at deltatau.com
Wed Aug 21 08:48:33 EST 2013


Ben,


After your hints, I suspected that the read of the real-world I/O variable
*piom (a pointer obtained from ioremap_nocache) in the three-line critical
interrupt handler

void critintr_handler(void *dev)
{
  critintrcount++;          // increment a variable
  iodata = *piom;           // read an I/O location 
  mtdcr(0x0c0, 0x00002000); // clear critical interrupt 
} 

is what caused the problem. Commenting it out seems to make the system stable.  

This led us to disable the critical interrupt while in the
DataTLBError44x and InstructionTLBError44x exceptions.  With that change,
the system seems more stable while the critical interrupt handler reads
real-world I/O for our application.


  /* Data TLB Error Interrupt */
  START_EXCEPTION(DataTLBError44x)
  mtspr	SPRN_SPRG_WSCRATCH0, r10  /* Save some working */
+  mfmsr r10                      /*  Disable the */
+  rlwinm r10,r10,0,15,13         /*  MSR's CE bit */
+  mtmsr r10                     
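
For reference, the mask arithmetic behind the added rlwinm can be checked
in plain C.  The sketch below is only an illustration, not kernel code: the
helper names (rotl32, rlwinm_mask, rlwinm, disable_ce, check_ce_mask) are
made up for this example, and it assumes the Book E MSR layout in which CE
is IBM bit 14 (0x00020000).

```c
#include <assert.h>
#include <stdint.h>

#define MSR_CE 0x00020000u /* Book E MSR[CE]: IBM bit 14, i.e. 1 << 17 */

/* Rotate left by sh bits (sh taken modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned sh)
{
    sh &= 31u;
    return sh ? (v << sh) | (v >> (32u - sh)) : v;
}

/* Mask selected by rlwinm's MB/ME fields, IBM numbering (bit 0 = MSB).
 * If MB > ME the mask wraps: it covers bits MB..31 and bits 0..ME. */
static uint32_t rlwinm_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;           /* bits mb..31 */
    uint32_t up_to_me = 0xFFFFFFFFu << (31u - me);  /* bits 0..me  */
    return (mb <= me) ? (from_mb & up_to_me) : (from_mb | up_to_me);
}

/* Rotate left, then AND with the MB..ME mask. */
static uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & rlwinm_mask(mb, me);
}

/* What "rlwinm r10,r10,0,15,13" does to an MSR value held in r10. */
static uint32_t disable_ce(uint32_t msr)
{
    return rlwinm(msr, 0, 15, 13);
}

/* Self-check: the wrapped mask 15..13 keeps every bit except CE. */
static void check_ce_mask(void)
{
    assert(rlwinm_mask(15, 13) == ~MSR_CE);
    assert((disable_ce(0xFFFFFFFFu) & MSR_CE) == 0);
    assert(disable_ce(0x00029000u) == 0x00009000u);
}
```

Calling check_ce_mask() from a small test program confirms that the
15..13 mask clears only the CE bit and leaves every other MSR bit as it was.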


Do you see any potential problems with this approach?

If so, can you advise us on a better way to handle this?


On Tue, 2013-08-20 at 06:56 +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2013-08-19 at 12:00 -0700, Henry Bausley wrote:
> > 
> > Support does appear to be present but there is a problem returning
> > back to user space I suspect.
> 
> Probably a problem with TLB misses vs. crit interrupts.
> 
> A critical interrupt can re-enter a TLB miss.
> 
> I can see two potential issues there:
> 
>  - A bug where we don't properly restore "something" (I thought we did
> save and restore MMUCR though, but it's worth double-checking that it works
> properly) across the crit entry/exit
> 
>  - Something in your crit code causing a TLB miss (the
> kernel .text/.data/.bss should be bolted but anything else can). We
> don't currently support re-entering the TLB miss that way.
> 
> If we were to support the latter, we'd need to detect on entering a crit
> that the PC is within the TLB miss handler, and set up a return context
> to the original instruction (replay the miss) rather than trying to
> resume it.
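
The detection Ben describes could look roughly like the sketch below.  It
is a user-space illustration in C, not kernel code, and every name in it
(crit_frame, text_range, resume_target, crit_return_target,
check_replay_decision) is hypothetical.  It relies on the Book E convention
that a critical interrupt saves the interrupted PC/MSR in CSRR0/CSRR1 while
the TLB miss itself saved the faulting context in SRR0/SRR1.  It only shows
the decision; a real implementation would also have to restore any
registers the miss handler had already stashed in its scratch SPRGs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot taken on entry to the critical interrupt. */
struct crit_frame {
    uint32_t csrr0; /* PC that the critical interrupt preempted      */
    uint32_t csrr1; /* MSR of that preempted context                 */
    uint32_t srr0;  /* PC of the instruction that took the TLB miss  */
    uint32_t srr1;  /* MSR of the context that took the TLB miss     */
};

/* Half-open address range [start, end) covering the TLB miss handler. */
struct text_range {
    uint32_t start;
    uint32_t end;
};

/* Where the critical interrupt should return to. */
struct resume_target {
    uint32_t pc;
    uint32_t msr;
    bool replay; /* true: re-execute the faulting instruction */
};

static bool pc_in_range(uint32_t pc, const struct text_range *r)
{
    return pc >= r->start && pc < r->end;
}

/* If the crit preempted the TLB miss handler, return to the instruction
 * that originally missed so the miss is taken again from scratch;
 * otherwise resume the preempted code as usual. */
static struct resume_target
crit_return_target(const struct crit_frame *f, const struct text_range *tlb_miss)
{
    if (pc_in_range(f->csrr0, tlb_miss))
        return (struct resume_target){ f->srr0, f->srr1, true };
    return (struct resume_target){ f->csrr0, f->csrr1, false };
}

/* Self-check with made-up addresses. */
static void check_replay_decision(void)
{
    const struct text_range miss = { 0xc0001000u, 0xc0001100u };
    const struct crit_frame f = { 0xc0001010u, 0x00001000u,
                                  0x10000abcu, 0x0002d000u };
    struct resume_target t = crit_return_target(&f, &miss);

    assert(t.replay && t.pc == 0x10000abcu && t.msr == 0x0002d000u);
}
```

The range is half-open, so a crit taken at the first instruction after the
handler resumes normally rather than replaying.
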
> 
> Cheers,
> Ben.
> 
> > What fails is that it causes Linux user space programs to get
> > segmentation faults.  Issuing a simple ls sometimes causes a
> > segmentation fault.  The shell gets terminated and you cannot log
> > back in.  INIT: Id "T0" respawning too fast: disabled for 5 minutes
> > pops up.
> > 
> > However, the critical interrupt handler keeps running.  I know this
> > because I added a read of a physical I/O location to the handler and
> > can see it being read on the scope.
> > 
> > 
> > The only code in the handler is below.
> > 
> > void critintr_handler(void *dev)
> > {
> >   critintrcount++;          // increment a variable
> >   iodata = *piom;           // read an I/O location 
> >   mtdcr(0x0c0, 0x00002000); // clear critical interrupt
> > }
> > 
> > 
> > Below is a log of the type of crashes that occur:
> > 
> > root@10.34.9.213:/opt/ppmac/ktest# ls
> > Segmentation fault
> > root@10.34.9.213:/opt/ppmac/ktest# ls
> > Segmentation fault
> > root@10.34.9.213:/opt/ppmac/ktest# ls
> > Makefile        ktest.c    ktest.ko     ktest.mod.o  modules.order
> > Module.symvers  ktest.cbp  ktest.mod.c  ktest.o
> > root@10.34.9.213:/opt/ppmac/ktest# ls
> > 
> > Debian GNU/Linux 7 powerpmac ttyS0
> > 
> > powerpmac login: root
> > 
> > Debian GNU/Linux 7 powerpmac ttyS0
> > 
> > powerpmac login: root
> > 
> > Debian GNU/Linux 7 powerpmac ttyS0
> > 
> > powerpmac login: root
> > 
> > Debian GNU/Linux 7 powerpmac ttyS0
> > 
> > powerpmac login: root
> > Password: 
> > Last login: Thu Nov 30 20:42:16 UTC 1933 on ttyS0
> > Linux powerpmac 3.2.21-aspen_2.01.09 #10 Mon Aug 19 08:49:12 PDT 2013
> > ppc
> > 
> > The programs included with the Debian GNU/Linux system are free
> > software;
> > the exact distribution terms for each program are described in the
> > individual files in /usr/share/doc/*/copyright.
> > 
> > Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
> > permitted by applicable law.
> > INIT: Id "T0" respawning too fast: disabled for 5 minutes
> > 
> > 
> > ______________________________________________________________________
> > From: "Benjamin Herrenschmidt" <benh at kernel.crashing.org>
> > Sent: Saturday, August 17, 2013 3:05 PM
> > To: "Kumar Gala" <galak at kernel.crashing.org>
> > Cc: linuxppc-dev at lists.ozlabs.org, hbausley at deltatau.com
> > Subject: Re: Critical Interrupt Input
> > 
> > On Fri, 2013-08-16 at 06:04 -0500, Kumar Gala wrote:
> > > The 44x low level code needs to handle exception stacks properly for
> > > this to work. Since it's possible to have a critical exception occur
> > > while in a normal exception level, you have to have proper saving of
> > > additional register state and a stack frame for the critical
> > > exception, etc. I'm not sure if that was ever done for 44x.
> > 
> > Don't 44x and FSL BookE share the same macros ? I would think 44x does
> > indeed implement the same crit support as e500...
> > 
> > What does the crash look like ?
> > 
> > Ben.
> > 
> > 