[PATCH V4] powerpc/85xx: Add machine check handler to fix PCIe erratum on mpc85xx

Tue Apr 2 20:28:10 EST 2013

> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Saturday, March 30, 2013 12:34 AM
> To: Jia Hongtao-B38951
> Cc: Wood Scott-B07421; David Laight; linuxppc-dev at lists.ozlabs.org;
> Stuart Yoder
> Subject: Re: [PATCH V4] powerpc/85xx: Add machine check handler to fix
> PCIe erratum on mpc85xx
> 
> On 03/29/2013 03:03:51 AM, Jia Hongtao-B38951 wrote:
> >
> >
> > > -----Original Message-----
> > > From: Wood Scott-B07421
> > > Sent: Saturday, March 16, 2013 12:35 AM
> > > To: Jia Hongtao-B38951
> > > Cc: Wood Scott-B07421; David Laight; linuxppc-dev at lists.ozlabs.org;
> > > Stuart Yoder
> > > Subject: Re: [PATCH V4] powerpc/85xx: Add machine check handler to
> > fix
> > > PCIe erratum on mpc85xx
> > >
> > > On 03/14/2013 09:47:58 PM, Jia Hongtao-B38951 wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Wood Scott-B07421
> > > > > Sent: Thursday, March 14, 2013 12:38 AM
> > > > > To: David Laight
> > > > > Cc: Jia Hongtao-B38951; Wood Scott-B07421;
> > > > linuxppc-dev at lists.ozlabs.org;
> > > > > Stuart Yoder
> > > > > Subject: Re: [PATCH V4] powerpc/85xx: Add machine check handler
> > to
> > > > fix
> > > > > PCIe erratum on mpc85xx
> > > > >
> > > > > On 03/13/2013 04:40:40 AM, David Laight wrote:
> > > > > > > Hmm, seems there's no probe_user_address() -- for userspace
> > we
> > > > > > > basically want the same thing minus the KERNEL_DS.  See
> > > > > > > arch/powerpc/perf/callchain.c for an example.
> > > > > >
> > > > > > Isn't that just copy_from_user() ?
> > > > >
> > > > > Plus pagefault_disable/enable().
> > > > >
> > > > > -Scott
> > > >
> > > > pagefault_disable() is identical to preempt_disable(). So I think
> > this
> > > > could not avoid other cpu to swap out the instruction we want to
> > read
> > > > back.
> > > > probe_kernel_address() also have the same issue.
> > >
> > > That's not the point -- the point is to let the page fault handler
> > know
> > > that it should go directly to bad_page_fault().  Do not pass
> > > handle_mm_fault().  Do not collect a page from disk.
> > >
> > > Granted, we're already in atomic context which will have that effect
> > > due to being in the machine check handler, but it's better to be
> > > explicit about it and not depend on how pagefault_diasble() is
> > > implemented.
> > >
> > > -Scott
> >
> >
> > Based on the comments I updated the machine check handler.
> >
> > Changes from last version:
> > * Check MSR_GS state
> > * Check if the instruction is LD
> > * Handle the user space issue
> >
> > The updated machine check handler is as following:
> >
> > int fsl_pci_mcheck_exception(struct pt_regs *regs) {
> >         unsigned int op, rd;
> >         u32 inst;
> >         int ret;
> >         phys_addr_t addr = 0;
> >
> >         /* Let KVM/QEMU deal with the exception */
> >         if (regs->msr & MSR_GS)
> >                 return 0;
> >
> > #ifdef CONFIG_PHYS_64BIT
> >         addr = mfspr(SPRN_MCARU);
> >         addr <<= 32;
> > #endif
> >         addr += mfspr(SPRN_MCAR);
> >
> >         if (is_in_pci_mem_space(addr)) {
> >                 if (user_mode(regs)) {
> >                         pagefault_disable();
> >                         ret = copy_from_user(&(inst), (u32 __user
> > *)regs->nip, sizeof(inst));
> >                         pagefault_enable();
> 
> You could use get_user() instead of copy_from_user().

Got it.

> 
> >                 } else {
> >                         ret = probe_kernel_address(regs->nip, inst);
> >                 }
> >
> >                 op = get_op(inst);
> >                 /* Check if the instruction is LD */
> >                 if (!ret && (op == 111010)) {
> >                         rd = get_rt(inst);
> >                         regs->gpr[rd] = 0xffffffff;
> >                 }
> >
> >                 regs->nip += 4;
> >                 return 1;
> >         }
> >
> >         return 0;
> > }
> >
> > BTW, I'm still not sure how to deal with LD instruction with update.
> 
> You would need to do the update yourself.  Or just say that's a case you
> don't handle, and return 0.
> 
> Again, please check for the size of the load operation.
> 
> -Scott

For informing error to the process that hold the stall instruction
we need to do:
1. Verify the instruction is load.
2. Fill the rd register with ~0UL.
3. Deal with the load instruction with update.

Here is the problems:
1. So many load instructions to handle. There are dozens of load instructions
   and most of them with different op code. Like:

   lbz: 1 0 0 0 1 0
   lhz: 1 0 1 0 0 0
   lwz: 1 0 0 0 0 0
   ld : 1 1 1 0 1 0
   ...

   Is there any available API for verifying the load instruction?

2. For different size of load operation could we just fill the rd register with
   ~0UL?

3. A load instruction with update could not just verified by op code. I'd like
   to leave it along. I think we could not fix but just inform the error by
   filling the rd with ~0UL. Could you explain why should we care about the update?

The main problem is how to verifying the load instruction. I wonder if there is
an easy way to do it. If not the code here will be so complicated.

Thanks.
-Hongtao.