Need help with using the BDI2K to debug exception handlers
Bruce_Leonard at selinc.com
Thu Mar 11 12:27:34 EST 2010
Hi all,
Okay, I'm putting on my asbestos underwear and hoping I don't sound too
stupid. Here's my sitch: we're seeing an illegal instruction exception,
but the diagnostic code we put into the kernel's
program_check_exception() function claims the instruction is perfectly
good. So I want to use the BDI to set a BP in the program exception and
poke around at a HW level rather than a SW level that has gone through an
unknown number of context switches.
Now I know that using the BDI in exceptions is hard to do for lots of
reasons, first and foremost among them being the fact that the BDI uses
SRR0/1 for its own purposes. I've been down this path before and know
there are problems. But what I'm seeing is even stranger than usual.
I've replaced the program exception code in arch/powerpc/kernel/head_32.S
with the following:
        . = 0x700
ProgramCheck:
        mtspr   SPRN_SPRG1,r9
        mtspr   SPRN_SPRG2,r10
        mtspr   SPRN_SPRG7,r3
        mfspr   r9,SPRN_SRR0
        mfspr   r10,SPRN_SRR1
        andis.  r3,r10,0x0008   /* is it an illegal instruction? */
        beq     1f              /* no so continue */
2:      xor     r3,r3,r3        /* dummy instruction */
        b       2b              /* loop forever */
1:      mfspr   r10,SPRN_SPRG2
        mfspr   r9,SPRN_SPRG1
        EXCEPTION_PROLOG;
        addi    r3,r1,STACK_FRAME_OVERHEAD;
        EXC_XFER_STD(0x700, program_check_exception);
(Before everyone flames me, yes I know there's a bug, I didn't restore r3
before continuing to the program_check_exception; it's immaterial to the
problem at hand because I don't really care if I ever successfully get
into program_check_exception.) The purpose of all this is to save SRR0/1
into GPRs so the BDI doesn't whack them, check to see if the exception is
being called because of an illegal instruction, continue on if not, and
provide a place to hang a breakpoint if it is an illegal instruction. So
I load up this code, connect to the BDI, set a HW BP on the branch
instruction following the line labeled '2', tell it to go and sit back to
wait.
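For anyone who wants to double-check the mask, here is a minimal C sketch of the same SRR1 test. It assumes the classic (non-Book-E) program-check reason bits, which I believe match the REASON_* masks in arch/powerpc/kernel/traps.c. The helper names srr1_is_illegal(), srr1_reason() and srr1_selfcheck() are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Program-check reason bits in SRR1 on classic PowerPC.
 * Assumption: same values as the REASON_* masks in traps.c. */
#define REASON_FP         0x00100000u /* floating-point enabled exception */
#define REASON_ILLEGAL    0x00080000u /* illegal instruction */
#define REASON_PRIVILEGED 0x00040000u /* privileged instruction */
#define REASON_TRAP       0x00020000u /* trap instruction */

/* Hypothetical helper: nonzero if SRR1 flags an illegal instruction.
 * Same mask as "andis. r3,r10,0x0008", which ANDs with 0x0008 << 16. */
static int srr1_is_illegal(uint32_t srr1)
{
    return (srr1 & REASON_ILLEGAL) != 0;
}

/* Hypothetical helper: name of the first reason bit found, or "none". */
static const char *srr1_reason(uint32_t srr1)
{
    if (srr1 & REASON_FP)
        return "fp";
    if (srr1 & REASON_ILLEGAL)
        return "illegal";
    if (srr1 & REASON_PRIVILEGED)
        return "privileged";
    if (srr1 & REASON_TRAP)
        return "trap";
    return "none";
}

/* Hypothetical self-check of the mask arithmetic. */
static void srr1_selfcheck(void)
{
    assert((0x0008u << 16) == REASON_ILLEGAL);
    assert(srr1_is_illegal(0x00080000u));
    assert(!srr1_is_illegal(0x00020000u));
    assert(strcmp(srr1_reason(0x00040000u), "privileged") == 0);
}
```

Feeding the r10 value read back from the BDI into srr1_reason() gives a quick sanity check on whether the register plausibly holds a saved SRR1 at all.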
Eventually our problem occurs and the BDI says "TARGET: stopped" or some
such, indicating it's hit the breakpoint. This is where things get
strange and I need help. At this point the BDI output from the telnet
session says the debug entry cause is <unknown 0> and the current PC is
0x6fc, one instruction before the program exception. When I dump the
registers, r9 and r10 contain nothing that even remotely resembles SRR0/1.
The link register contains a valid _physical_ address (though I would
expect it to contain a virtual address from the last 'bl' instruction) but
when I dump the memory pointed to by LR it contains all zeros, not PPC
machine code. It looks like my code isn't even running even though it
seems I've hit the breakpoint. It's almost as if the BDI recognizes I'm
entering an exception that I've set a BP in and halts just before
executing the exception code. I'm not sure I believe it, but that's how
it appears.
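To make the physical-versus-virtual distinction concrete, here is a rough C sketch of the address arithmetic. It assumes the common 32-bit PowerPC layout where kernel virtual addresses start at 0xC0000000 and physical RAM starts at 0. Both are configurable, so treat them as assumptions rather than facts about this board. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: default 32-bit kernel mapping, physical RAM based at 0. */
#define PAGE_OFFSET 0xC0000000u

/* Hypothetical helper: nonzero if addr lies in the kernel's linear mapping. */
static int is_kernel_virtual(uint32_t addr)
{
    return addr >= PAGE_OFFSET;
}

/* Hypothetical helper: physical address to kernel virtual address. */
static uint32_t phys_to_virt32(uint32_t pa)
{
    return pa + PAGE_OFFSET;
}

/* Hypothetical helper: kernel virtual address to physical address. */
static uint32_t virt_to_phys32(uint32_t va)
{
    return va - PAGE_OFFSET;
}

/* Hypothetical self-check using the addresses mentioned in this post. */
static void addr_selfcheck(void)
{
    assert(!is_kernel_virtual(0x000006fcu));           /* reported PC */
    assert(phys_to_virt32(0x00000700u) == 0xC0000700u); /* vector 0x700 */
    assert(virt_to_phys32(0xC0000700u) == 0x00000700u);
}
```

Under those assumptions, a PC of 0x6fc or 0x700 is a physical address, which is what one would expect with address translation off at exception entry, while a return address left by a 'bl' in running kernel code would normally be at or above 0xC0000000.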
Has anyone seen this or have any suggestions on how I can get the BDI to
quit 'helping' me and just stop where I tell it to in an exception
handler?
Thanks
Bruce