debug problems on ppc 83xx target due to changed struct task_struct
Holger Brunck
holger.brunck at keymile.com
Tue Aug 16 00:35:47 AEST 2016
On 12/08/16 18:09, Dave Hansen wrote:
> On 08/12/2016 08:47 AM, Holger Brunck wrote:
>> On 12/08/16 17:14, Dave Hansen wrote:
>>> On 08/12/2016 07:50 AM, Holger Brunck wrote:
>>>> When I try to debug our multithreaded userspace application with gdb I get
>>>> stuck when trying to single step code.
>>>
>>> Can you clarify "stuck"? Like the instructions don't advance? Have you
>>> been able to find a root cause for this?
>>
>> the behaviour is slightly different on the kernel versions. So my setup is a
>> remote debug session via gdbserver.
>>
>> After connecting to the gdbserver I set a break point and start to run my
>> program. When hitting the breakpoint I try to single step. With stuck I mean
>> that the connection to the gdbserver is broken and I can't control my debug
>> session anymore while the application is not continuing.
>
> Could you try debugging locally with gdb? It would be nice to take all
> the stuff involved with remote debugging out of the picture.
>
I tried this, but unfortunately the error only occurs while remote debugging.
Locally with gdb everything works fine. BTW we double-checked with an 85xx ppc
target, which is also 32-bit, and it ends up with the same behaviour.
I also investigated where in struct task_struct the "struct thread_struct thread"
line has to be placed, and it turns out to be like this (diff against the 4.7 kernel):
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 253538f..4868874 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1655,7 +1655,9 @@ struct task_struct {
 	struct signal_struct *signal;
 	struct sighand_struct *sighand;
+	// struct thread_struct thread; // until here everything is fine
 	sigset_t blocked, real_blocked;
+	struct thread_struct thread; // from here it's broken
 	sigset_t saved_sigmask;	/* restored if set_restore_sigmask() was used */
 	struct sigpending pending;
@@ -1919,7 +1921,6 @@ struct task_struct {
 	struct task_struct *oom_reaper_list;
 #endif
 /* CPU-specific state of this task */
-	struct thread_struct thread;
 /*
So it's in the area where some signal information is stored, which makes sense
because this is heavily used when debugging with gdb.
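To illustrate what moving the member does to the layout, here is a small
userspace sketch. It is not kernel code: mock_thread_struct, mock_sigset and
the two layout structs are made-up stand-ins with arbitrary sizes. It only shows
how the offsets of the neighbouring signal fields differ between the two
positions marked in the diff above, and says nothing about the root cause.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins; the real kernel types are different and larger. */
struct mock_thread_struct { unsigned long regs[8]; };
struct mock_sigset { unsigned long sig[2]; };

/* thread placed before blocked/real_blocked ("everything is fine"). */
struct layout_before {
	void *signal;
	void *sighand;
	struct mock_thread_struct thread;
	struct mock_sigset blocked, real_blocked;
	struct mock_sigset saved_sigmask;
};

/* thread placed after blocked/real_blocked ("it's broken"). */
struct layout_after {
	void *signal;
	void *sighand;
	struct mock_sigset blocked, real_blocked;
	struct mock_thread_struct thread;
	struct mock_sigset saved_sigmask;
};

/* Number of bytes by which `blocked` moves between the two layouts. */
long blocked_shift(void)
{
	return (long)offsetof(struct layout_before, blocked)
	     - (long)offsetof(struct layout_after, blocked);
}

/* Print the offsets of the fields of interest for both layouts. */
void print_layouts(void)
{
	printf("before: thread=%zu blocked=%zu saved_sigmask=%zu\n",
	       offsetof(struct layout_before, thread),
	       offsetof(struct layout_before, blocked),
	       offsetof(struct layout_before, saved_sigmask));
	printf("after:  thread=%zu blocked=%zu saved_sigmask=%zu\n",
	       offsetof(struct layout_after, thread),
	       offsetof(struct layout_after, blocked),
	       offsetof(struct layout_after, saved_sigmask));
}
```

In this mock-up only thread, blocked and real_blocked change their offsets
between the two variants; saved_sigmask and everything after it stay where
they are.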
> Have you tried turning on a bunch of kernel debugging (SLAB/SLUB
> debugging, pagealloc debug, lockdep, etc...)? If something is getting
> corrupted, those tend to catch it.
>
I switched on some memory debugging features but didn't get any suspicious output.
To make the situation even weirder, the error disappeared after I enabled FTRACE
in the kernel to trace some signal code.
>
> Is the process still alive at the point that the remote debugger stops
> responding? What is it doing at that point?
>
The process is still alive. The state of the process, its threads and the
gdbserver is like this:
Bad case after a single step:
73 73 TS  -   0 19 0 0.3 S    sigsuspend     gdbserver
74 74 TS  -   0 19 0 0.0 tl+  ptrace_stop    infra_pbec83xx_
74 77 IDL 0   - 19 0 0.0 tl+  ptrace_stop    TR_Task
74 78 IDL 0   - 19 0 0.0 tl+  ptrace_stop    TR_Timeout
74 79 TS  -   0 19 0 0.0 tl+  poll_schedule_ timed_msg
74 80 IDL 0   - 19 0 0.0 tl+  ptrace_stop    stimuli
74 81 TS  -  -5 24 0 0.0 t<l+ ptrace_stop    timer0Dflt
74 82 TS  - -19 38 0 0.0 t<l+ futex_wait_que timerUpd0
74 83 TS  - -19 38 0 0.0 t<l+ timerfd_read   timerClk
74 84 TS  - -19 38 0 0.0 t<l+ ptrace_stop    b/beatWDogRefr
Good case after a single step:
76 76 TS  -   0 19 0 4.0 S   poll_schedule_ gdbserver
77 77 TS  -   0 19 0 0.0 tl  ptrace_stop    infra_pbec83xx_
77 84 IDL 0   - 19 0 0.0 tl  ptrace_stop    TR_Task
77 85 IDL 0   - 19 0 0.0 tl  ptrace_stop    TR_Timeout
77 86 TS  -   0 19 0 0.0 tl  ptrace_stop    timed_msg
77 87 IDL 0   - 19 0 0.0 tl  ptrace_stop    stimuli
77 88 TS  -  -5 24 0 0.0 t<l ptrace_stop    timer0Dflt
77 89 TS  - -19 38 0 0.0 t<l ptrace_stop    timerUpd0
77 90 TS  - -19 38 0 0.0 t<l ptrace_stop    timerClk
77 91 TS  - -19 38 0 0.0 t<l ptrace_stop    b/beatWDogRefr
So in the error case only some of the threads are in ptrace_stop, whereas all of
them should be there after a single step in gdb. So the problem seems to be
somewhere in the signal handling between the kernel and gdbserver.
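A quick way to spot the stragglers in listings like the ones above is to check
the wait channel column. This is only a minimal sketch, assuming every line has
eleven whitespace-separated columns with the wait channel in the tenth one, as
in the output above. The function names are my own and not part of any tool.

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if the thread described by one line of the listing is parked in
 * ptrace_stop, 0 if it is waiting somewhere else, and -1 if the line does
 * not have at least ten columns.
 */
int in_ptrace_stop(const char *line)
{
	char wchan[32];

	/* Skip the first nine columns and keep the tenth (wait channel). */
	if (sscanf(line, "%*s %*s %*s %*s %*s %*s %*s %*s %*s %31s", wchan) != 1)
		return -1;
	return strcmp(wchan, "ptrace_stop") == 0;
}

/* Counts the entries in lines[0..n) that are not waiting in ptrace_stop. */
int count_not_stopped(const char *const *lines, int n)
{
	int count = 0;

	for (int i = 0; i < n; i++)
		if (in_ptrace_stop(lines[i]) == 0)
			count++;
	return count;
}
```

Fed with the thread lines of PID 74 from the bad case, this flags timed_msg,
timerUpd0 and timerClk as the threads that never reached ptrace_stop.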
Best regards
Holger Brunck