Need help with using the BDI2K to debug exception handlers

Bruce_Leonard at selinc.com Bruce_Leonard at selinc.com
Thu Mar 11 12:27:34 EST 2010


Hi all,

Okay, I'm putting on my asbestos underwear and hoping I don't sound too 
stupid.  Here's my sitch: we're seeing an illegal instruction exception, 
but the diagnostic code we put into the kernel's 
program_check_exception() function claims the instruction is perfectly 
good.  So I want to use the BDI to set a breakpoint (BP) in the program 
exception handler and poke around at the HW level, rather than at a SW 
level that has gone through an unknown number of context switches.

Now I know that using the BDI in exceptions is hard to do for lots of 
reasons, first and foremost among them being the fact that the BDI uses 
SRR0/1 for its own purposes.  I've been down this path before and know 
there are problems.  But what I'm seeing is even stranger than usual. 

I've replaced the program exception code in arch/powerpc/kernel/head_32.S 
with the following:

        . = 0x700
ProgramCheck:
        mtspr   SPRN_SPRG1,r9
        mtspr   SPRN_SPRG2,r10
        mtspr   SPRN_SPRG7,r3
        mfspr   r9,SPRN_SRR0
        mfspr   r10,SPRN_SRR1
        andis.  r3,r10,0x0008           /* is it an illegal instruction? */
        beq     1f                      /* no so continue */
2:      xor     r3,r3,r3                        /* dummy instruction */
        b       2b                      /* loop forever */
1:      mfspr   r10,SPRN_SPRG2
        mfspr   r9,SPRN_SPRG1
        EXCEPTION_PROLOG;
        addi    r3,r1,STACK_FRAME_OVERHEAD;
        EXC_XFER_STD(0x700, program_check_exception);

(Before everyone flames me, yes I know there's a bug: I didn't restore r3 
before continuing to program_check_exception.  It's immaterial to the 
problem at hand because I don't really care if I ever successfully get 
into program_check_exception.)  The purpose of all this is to save SRR0/1 
into GPRs so the BDI doesn't whack them, check whether the exception is 
being raised because of an illegal instruction, continue on if not, and 
provide a place to hang a breakpoint if it is an illegal instruction.  So 
I load up this code, connect to the BDI, set a HW BP on the branch 
instruction following the line labeled '2', tell it to go, and sit back 
to wait.
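
For reference, here is a minimal C sketch of the test that the andis. 
instruction in the handler above performs on the saved SRR1 value.  The 
mask values are the program-check reason bits for classic (non-BookE) 
32-bit PowerPC; the function names are made up for illustration and are 
not kernel or BDI APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* SRR1 program-check reason bits on classic (non-BookE) 32-bit PowerPC. */
#define REASON_FP         0x00100000u  /* FP enabled exception  */
#define REASON_ILLEGAL    0x00080000u  /* illegal instruction   */
#define REASON_PRIVILEGED 0x00040000u  /* privileged instruction */
#define REASON_TRAP       0x00020000u  /* trap instruction      */

/* Hypothetical helper: same test as "andis. r3,r10,0x0008".
 * andis. ANDs the source with (immediate << 16), so 0x0008 selects
 * bit 0x00080000 of SRR1. */
static bool srr1_is_illegal_insn(uint32_t srr1)
{
    return (srr1 & (0x0008u << 16)) != 0;
}

/* Hypothetical helper: name the first reason bit found in SRR1. */
static const char *srr1_reason(uint32_t srr1)
{
    if (srr1 & REASON_FP)         return "fp";
    if (srr1 & REASON_ILLEGAL)    return "illegal";
    if (srr1 & REASON_PRIVILEGED) return "privileged";
    if (srr1 & REASON_TRAP)       return "trap";
    return "unknown";
}
```

Feeding the r10 value from a register dump to srr1_is_illegal_insn() 
gives a quick check of whether the handler should have fallen into the 
loop at label '2'.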

Eventually our problem occurs and the BDI says "TARGET: stopped" or some 
such, indicating it's hit the breakpoint.  This is where things get 
strange and I need help.  At this point the BDI output from the telnet 
session says the debug entry cause is <unknown 0> and the current PC is 
0x6fc, one instruction before the program exception vector at 0x700.  
When I dump the registers, r9 and r10 contain nothing that even remotely 
resembles SRR0/1. 
The link register contains a valid _physical_ address (though I would 
expect it to contain a virtual address from the last 'bl' instruction) but 
when I dump the memory pointed to by LR it contains all zeros, not PPC 
machine code.  It looks like my code isn't even running even though it 
seems I've hit the breakpoint.  It's almost as if the BDI recognizes I'm 
entering an exception that I've set a BP in and halts just before 
executing the exception code.  I'm not sure I believe it, but that's how 
it appears.
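
As an aid for sanity-checking addresses in a dump like this, here is a 
small C sketch that tests whether address translation is enabled in an 
MSR value and converts between physical and kernel virtual addresses.  
It ASSUMES the default 32-bit layout (PAGE_OFFSET of 0xC0000000 with RAM 
starting at physical 0), which may not match this board's configuration; 
the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR translation bits on classic 32-bit PowerPC. */
#define MSR_IR 0x00000020u  /* instruction address translation */
#define MSR_DR 0x00000010u  /* data address translation        */

/* ASSUMPTION: default kernel linear mapping, RAM at physical 0. */
#define ASSUMED_PAGE_OFFSET 0xC0000000u

/* True only when both instruction and data translation are on. */
static bool translation_on(uint32_t msr)
{
    return (msr & (MSR_IR | MSR_DR)) == (MSR_IR | MSR_DR);
}

/* Physical address -> kernel virtual address under the assumption. */
static uint32_t phys_to_kernel_virt(uint32_t pa)
{
    return pa + ASSUMED_PAGE_OFFSET;
}

/* Kernel virtual address -> physical address under the assumption. */
static uint32_t kernel_virt_to_phys(uint32_t va)
{
    return va - ASSUMED_PAGE_OFFSET;
}
```

With translation off, a memory dump of the LR value reads physical 
memory directly; with it on, the same number is treated as a virtual 
address, so comparing both views may show which one holds real code.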

Has anyone seen this, or does anyone have suggestions on how I can get 
the BDI to quit 'helping' me and just stop where I tell it to in an 
exception handler?

Thanks

Bruce