debug problems on ppc 83xx target due to changed struct task_struct

Benjamin Herrenschmidt benh at kernel.crashing.org
Wed Aug 17 08:13:52 AEST 2016


On Mon, 2016-08-15 at 09:19 -0700, Dave Hansen wrote:
> 
> Wow, thanks for all the debugging here!

Yup, thanks, that's really odd... I wonder if one of those
structures is accessed beyond its boundary, either the sigset
or the thread struct, causing corruption of neighbouring fields
in task struct...

Can you try adding a little canary on both sides (make it not-so-little,
maybe a few words), which you initialize to a known pattern and check
every now and then?
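
The canary idea can be sketched roughly as below. This is a user-space
illustration only, not kernel code: struct guarded, CANARY_WORDS,
CANARY_PATTERN, canary_init() and canary_intact() are all made-up names,
and the "suspect" array merely stands in for the sigset or the thread
struct inside task struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical values: "a few words" of an easily recognised pattern. */
#define CANARY_WORDS   4
#define CANARY_PATTERN 0xdeadbeefUL

static_assert(CANARY_WORDS > 0, "need at least one guard word per side");

/* Stand-in for the suspect region of task struct (assumed layout). */
struct guarded {
	unsigned long canary_before[CANARY_WORDS];
	unsigned char suspect[16];	/* e.g. the sigset or the thread struct */
	unsigned long canary_after[CANARY_WORDS];
};

/* Fill both guard areas with the known pattern. */
static void canary_init(struct guarded *g)
{
	for (size_t i = 0; i < CANARY_WORDS; i++) {
		g->canary_before[i] = CANARY_PATTERN;
		g->canary_after[i] = CANARY_PATTERN;
	}
}

/* Return true only if neither guard area has been overwritten. */
static bool canary_intact(const struct guarded *g)
{
	for (size_t i = 0; i < CANARY_WORDS; i++) {
		if (g->canary_before[i] != CANARY_PATTERN ||
		    g->canary_after[i] != CANARY_PATTERN)
			return false;
	}
	return true;
}
```

In the kernel the check would presumably be called from a few places on
the signal and context-switch paths, with a WARN when it fails, so the
first corrupting access can be narrowed down.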

> So, we know it has to do with signals, thread_info, and probably only
> affects 32-bit powerpc.  Seems awfully weird.  Have you checked with any
> of the 64-bit powerpc guys to see if they have any ideas?
> 
> I went grepping around for a bit.
> 
> Where is the task_struct stored?  Is it on-stack on ppc32 or something?

No, it's allocated normally.

>  The thread_info is,

Yes, thread_info is at the bottom of the stack.

>  I assume, but I see some THREAD_INFO vs. THREAD
> (thread struct) math happening in here, which confuses me:
>  
>         .globl  ret_from_debug_exc
> ret_from_debug_exc:
>         mfspr   r9,SPRN_SPRG_THREAD
>         lwz     r10,SAVED_KSP_LIMIT(r1)
>         stw     r10,KSP_LIMIT(r9)
>         lwz     r9,THREAD_INFO-THREAD(r9)

This calculates the offset from the thread struct to the thread info
pointer (both are inside task struct) and loads that pointer into r9.
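
In C terms, that load is roughly equivalent to the sketch below. The
structure layouts here are simplified and hypothetical; the only
assumption carried over from the kernel is that THREAD_INFO and THREAD
are byte offsets of the thread info pointer and of the thread struct
within task struct. thread_to_thread_info() is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical layouts; the real structures are far larger. */
struct thread_info { int preempt_count; };
struct thread_struct { unsigned long ksp; unsigned long ksp_limit; };

struct task_struct {
	long state;
	void *stack;			/* pointer to this task's thread_info */
	struct thread_struct thread;	/* what SPRN_SPRG_THREAD points at */
};

/* Assumed to mirror the asm offsets used in the quoted assembly. */
#define THREAD_INFO offsetof(struct task_struct, stack)
#define THREAD      offsetof(struct task_struct, thread)

/* In this sketch the pointer sits before the thread struct, so the
 * displacement THREAD_INFO-THREAD is negative. */
static_assert(THREAD_INFO < THREAD, "sketch assumes stack precedes thread");

/* Equivalent of: lwz r9,THREAD_INFO-THREAD(r9) */
static struct thread_info *thread_to_thread_info(struct thread_struct *t)
{
	/* Step from &task->thread to &task->stack, then load the pointer. */
	char *field = (char *)t + ((ptrdiff_t)THREAD_INFO - (ptrdiff_t)THREAD);

	return *(void **)field;
}
```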

>         CURRENT_THREAD_INFO(r10, r1)
>         lwz     r10,TI_PREEMPT(r10)
>         stw     r10,TI_PREEMPT(r9)
>         RESTORE_xSRR(SRR0,SRR1);
>         RESTORE_xSRR(CSRR0,CSRR1);
>         RESTORE_MMU_REGS;
>         RET_FROM_EXC_LEVEL(SPRN_DSRR0, SPRN_DSRR1, PPC_RFDI)

Basically the above code transfers TI_PREEMPT from the "current"
thread info, which I believe would be on some exception/interrupt
stack, into the current task's thread info.
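
A rough C rendering of that transfer, for illustration only. It assumes
that CURRENT_THREAD_INFO finds the thread info by masking the stack
pointer down to a THREAD_SIZE boundary, and that THREAD_SHIFT is 13 (8K
stacks); both the value and the helper names thread_info_base() and
copy_preempt_count() are assumptions of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stack size: 1 << 13 = 8K. The real value is config dependent. */
#define THREAD_SHIFT 13
#define THREAD_SIZE  ((uintptr_t)1 << THREAD_SHIFT)

struct thread_info { int preempt_count; };

/* thread_info sits at the bottom of the stack, so clearing the low
 * THREAD_SHIFT bits of any stack pointer yields its address. */
static uintptr_t thread_info_base(uintptr_t sp)
{
	return sp & ~(THREAD_SIZE - 1);
}

/* Equivalent of the lwz/stw pair on TI_PREEMPT: copy the preempt count
 * from the exception stack's thread_info into the task's thread_info. */
static void copy_preempt_count(const struct thread_info *exc_ti,
			       struct thread_info *task_ti)
{
	task_ti->preempt_count = exc_ti->preempt_count;
}
```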

> But, I'm really at a loss to explain this.  It still seems like a deeply
> ppc-specific issue.  We can obviously work around it with an #ifdef for
> your platform, but that's awfully hackish and hides the real bug,
> whatever it is.
> 
> My suspicion is that there's a bug in the 32-bit ppc assembly somewhere.
> I don't see any references to 'blocked' or 'real_blocked' in assembly
> though.  You could add a bunch of padding instead of moving the
> thread_struct and see if that does anything, but that's really a stab in
> the dark.

