MCE handler gets NIP wrong on MPC8378

Christophe Leroy christophe.leroy at c-s.fr
Thu Feb 20 08:08:29 AEDT 2020


Radu Rendec <radu.rendec at gmail.com> wrote:

> On 02/19/2020 at 10:11 AM Radu Rendec <radu.rendec at gmail.com> wrote:
>> On 02/18/2020 at 1:08 PM Christophe Leroy <christophe.leroy at c-s.fr> wrote:
>> > On 18/02/2020 at 18:07, Radu Rendec wrote:
>> > > The saved NIP seems to be broken inside machine_check_exception() on
>> > > MPC8378, running Linux 4.9.191. The value is 0x900 most of the time,
>> > > but I have seen other weird values.
>> > >
>> > > I've been able to track down the entry code to head_32.S (vector 0x200),
>> > > but I'm not sure where/how the NIP value (where the exception occurred)
>> > > is captured.
>> >
>> > The NIP value is supposed to come from SRR0, which is loaded into r12 in
>> > PROLOG_2 and saved into _NIP(r11) in transfer_to_handler in entry_32.S.
>> >
>> > Could something be clobbering r12 at some point?
>> >
>>
>> I did something even simpler: I added the following
>>
>>       lis r12,0x1234
>>
>> ... right after
>>
>>       mfspr r12,SPRN_SRR0
>>
>> ... and now the NIP value I see in the crash dump is 0x12340000. This
>> means r12 is not clobbered and most likely the NIP value I normally see
>> is the actual SRR0 value.
>
> I apologize for the noise. I just found out accidentally that the saved
> NIP value is correct if interrupts are disabled at the time when the
> faulty access that triggers the MCE occurs. This seems to happen
> consistently.
>
> By "interrupts are disabled" I mean local_irq_save/local_irq_restore, so
> it's basically enough to wrap ioread32 to get the NIP value right.
>
> Does this make any sense? Maybe it's not a silicon bug after all, or
> maybe it is and I just found a workaround. Could this happen on other
> PowerPC CPUs as well?
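
For readers following along, the workaround described above (wrapping the
access in local_irq_save/local_irq_restore) can be sketched roughly as below.
This is only an illustration, not code from the thread: the local_irq_save,
local_irq_restore and ioread32 definitions here are userspace stand-ins so
that the snippet compiles outside the kernel, and ioread32_irqs_off is a
hypothetical helper name, not an existing kernel function.

```c
#include <stdint.h>

/*
 * Userspace stand-ins (assumptions for illustration only). In the kernel,
 * local_irq_save()/local_irq_restore() and ioread32() are provided by the
 * kernel headers and must not be redefined like this.
 */
static int irqs_enabled = 1;              /* models the local interrupt-enable state */
static int irqs_enabled_during_read = -1; /* recorded by the stand-in ioread32() */

#define local_irq_save(flags) \
	do { (flags) = (unsigned long)irqs_enabled; irqs_enabled = 0; } while (0)
#define local_irq_restore(flags) \
	do { irqs_enabled = (int)(flags); } while (0)

static uint32_t ioread32(const volatile void *addr)
{
	/* Remember whether interrupts were enabled when the access happened. */
	irqs_enabled_during_read = irqs_enabled;
	return *(const volatile uint32_t *)addr;
}

/*
 * Hypothetical helper: perform the MMIO read with local interrupts
 * disabled, then restore the previous interrupt state.
 */
uint32_t ioread32_irqs_off(const volatile void *addr)
{
	unsigned long flags;
	uint32_t val;

	local_irq_save(flags);
	val = ioread32(addr);
	local_irq_restore(flags);

	return val;
}
```

In a real driver only the body of ioread32_irqs_off would be needed; the
stand-ins at the top exist purely to make the ordering checkable in userspace.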

Interesting.

0x900 is the address of the timer (decrementer) interrupt vector.
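
As a rough illustration, checking whether a saved NIP sits exactly on an
exception vector entry could look like the sketch below (plain C, runnable in
userspace). The offsets are those of the classic 32-bit PowerPC vector layout
(0x200 machine check, 0x900 decrementer); the table is deliberately partial,
and struct ppc_vector, ppc_vectors and vector_name_for_nip are hypothetical
names, not kernel symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: vector offset and a human-readable name. */
struct ppc_vector {
	uint32_t offset;
	const char *name;
};

/* Partial list of classic 32-bit PowerPC exception vector offsets. */
static const struct ppc_vector ppc_vectors[] = {
	{ 0x100, "system reset" },
	{ 0x200, "machine check" },
	{ 0x300, "data storage (DSI)" },
	{ 0x400, "instruction storage (ISI)" },
	{ 0x500, "external interrupt" },
	{ 0x600, "alignment" },
	{ 0x700, "program" },
	{ 0x800, "floating-point unavailable" },
	{ 0x900, "decrementer (timer)" },
	{ 0xc00, "system call" },
};

/*
 * Return the vector name if nip is exactly the first instruction of a
 * known vector, or NULL otherwise. Assumes nip is the raw saved value
 * (as reported in the crash dump) and that vectors are based at 0.
 */
const char *vector_name_for_nip(uint32_t nip)
{
	size_t i;

	for (i = 0; i < sizeof(ppc_vectors) / sizeof(ppc_vectors[0]); i++) {
		if (ppc_vectors[i].offset == nip)
			return ppc_vectors[i].name;
	}
	return NULL;
}
```

With this, vector_name_for_nip(0x900) returns the decrementer entry, while the
0x12340000 test value used earlier in the thread returns NULL.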

Could the MCE be occurring just after the timer interrupt?

Can you tell us how your IO buses, etc. are configured?

Christophe



