[PATCH V4] powerpc/85xx: Add machine check handler to fix PCIe erratum on mpc85xx
B38951 at freescale.com
Tue Mar 5 21:12:30 EST 2013
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Tuesday, March 05, 2013 7:46 AM
> To: Stuart Yoder
> Cc: Jia Hongtao-B38951; linuxppc-dev at lists.ozlabs.org; Kumar Gala
> Subject: Re: [PATCH V4] powerpc/85xx: Add machine check handler to fix
> PCIe erratum on mpc85xx
> On 03/04/2013 10:16:10 AM, Stuart Yoder wrote:
> > On Mon, Mar 4, 2013 at 2:40 AM, Jia Hongtao <B38951 at freescale.com>
> > wrote:
> > > A PCIe erratum of mpc85xx may cause a core hang when a PCIe link
> > > goes down. When the link goes down, non-posted transactions issued
> > > via the ATMU requiring completion result in an instruction stall.
> > > At the same time a machine-check exception is generated to the core
> > > to allow further processing by the handler. We implement the
> > > handler, which skips the instruction that caused the stall.
> > Can you explain at a high level how just skipping an instruction
> > solves anything? If you just skip a load/store and continue like
> > nothing is wrong, isn't your system possibly in a really bad state?
> If the instruction was a load, we probably at least want to fill the
> destination register with 0xffffffff or similar.
You discussed this with Liu Shuo about a year ago.
Here is the log:
On 02/01/2012 02:18 AM, shuo.liu at freescale.com wrote:
> v3 : Skip the instruction only. Don't access the user space memory in
> machine check.
It may be the least bad option for now, but be aware that there's a
small chance that this will cause a leak of sensitive information (such
as a piece of a crypto key that happened to be sitting in the register
to be loaded into).
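To make the "fill the destination register" idea concrete, here is a
minimal user-space sketch in C. It is not the code from the patch: the
struct fake_regs type and the helper names decode_load() and
skip_stalled_insn() are hypothetical, and a real handler would work on the
kernel's saved register state instead. The sketch only recognises a few
integer loads from the Power ISA, writes all ones to every recognised
load regardless of its width, and does not emulate the RA update of the
update forms.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved register state of the faulting
 * context; this is NOT the kernel's struct pt_regs. */
struct fake_regs {
	uint32_t gpr[32];	/* general purpose registers r0..r31 */
	uint32_t nip;		/* address of the stalled instruction */
};

/* Return true if insn is one of the integer loads this sketch knows
 * about, and report its destination register (the RT field) in *rt. */
static bool decode_load(uint32_t insn, unsigned int *rt)
{
	unsigned int opcode = insn >> 26;	/* primary opcode, top 6 bits */
	unsigned int xo = (insn >> 1) & 0x3ff;	/* extended opcode, X-form */

	*rt = (insn >> 21) & 0x1f;

	switch (opcode) {
	case 32: case 33:	/* lwz, lwzu */
	case 34: case 35:	/* lbz, lbzu */
	case 40: case 41:	/* lhz, lhzu */
	case 42: case 43:	/* lha, lhau */
		return true;
	case 31:
		switch (xo) {
		case 23:	/* lwzx */
		case 87:	/* lbzx */
		case 279:	/* lhzx */
		case 343:	/* lhax */
		case 534:	/* lwbrx */
		case 790:	/* lhbrx */
			return true;
		default:
			return false;
		}
	default:
		return false;
	}
}

/* Skip the stalled instruction. If it was a load, overwrite the
 * destination register with all ones so that stale register contents
 * are not mistaken for data read from the device. */
static void skip_stalled_insn(struct fake_regs *regs, uint32_t insn)
{
	unsigned int rt;

	if (decode_load(insn, &rt))
		regs->gpr[rt] = 0xffffffffu;

	regs->nip += 4;		/* every instruction here is 4 bytes long */
}
```

With this sketch, a stalled "lwz r3,0(r4)" (encoded as 0x80640000) leaves
r3 holding 0xffffffff, while a stalled "stw r3,0(r4)" (0x90640000) leaves
r3 untouched. In both cases nip advances by 4.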
> > And if the core is already hung, due to the PCI link going down, isn't
> > it too late? How does skipping help?
> Maybe the machine check unhangs the core?
> Is there an erratum number for this?