Re: [PATCH][v3] hung_task: Panic after fixed number of hung tasks
Li,Rongqing
lirongqing at baidu.com
Tue Oct 14 21:49:53 AEDT 2025
> On Tue 2025-10-14 13:23:58, Lance Yang wrote:
> > Thanks for the patch!
> >
> > I noticed the implementation panics only when N tasks are detected
> > within a single scan, because total_hung_task is reset for each
> > check_hung_uninterruptible_tasks() run.
>
> Great catch!
>
> Does it make sense?
> Is it the intended behavior, please?
>
Yes, this is the intended behavior.
> > So some suggestions to align the documentation with the code's
> > behavior below :)
>
> > On 2025/10/12 19:50, lirongqing wrote:
> > > From: Li RongQing <lirongqing at baidu.com>
> > >
> > > Currently, when 'hung_task_panic' is enabled, the kernel panics
> > > immediately upon detecting the first hung task. However, some hung
> > > tasks are transient and the system can recover, while others are
> > > persistent and may accumulate progressively.
>
> My understanding is that this patch wanted to do:
>
> + report even temporary stalls
> + panic only when the stall was much longer and likely persistent
>
> Which might make some sense. But the code does something else.
>
A single task hanging for an extended period may not be a critical
issue, since users can usually still log in to the system to
investigate. However, if multiple tasks hang simultaneously, for
example when a disk failure causes I/O hangs, users may be unable to
log in at all. That is a serious problem, and a panic is expected in
that case.
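To make the per-scan semantics concrete, here is a minimal userspace
sketch, not kernel code. The function name scan_would_panic() and its
parameters are made up for illustration, and it assumes that
prev_detect_count is snapshotted at the start of each scan, which
matches the observation that total_hung_task is reset for each
check_hung_uninterruptible_tasks() run.

```c
#include <stdbool.h>

/* Userspace model only; models sysctl_hung_task_detect_count. */
static unsigned long detect_count;

/*
 * Model one scan (hypothetical helper, not part of the patch).
 * hung_in_scan:    number of hung tasks this scan finds
 * panic_threshold: models sysctl_hung_task_panic, 0 disables the panic
 * Returns true if this scan would request a panic.
 */
static bool scan_would_panic(unsigned int hung_in_scan,
			     unsigned int panic_threshold)
{
	/* Assumption: the snapshot is taken once at the start of the scan. */
	unsigned long prev_detect_count = detect_count;
	bool call_panic = false;

	for (unsigned int i = 0; i < hung_in_scan; i++) {
		detect_count++;
		/* Hung tasks seen so far in this scan only. */
		unsigned long total_hung_task = detect_count - prev_detect_count;

		if (panic_threshold && total_hung_task >= panic_threshold)
			call_panic = true;
	}
	return call_panic;
}
```

With a threshold of 3, two scans that each find two hung tasks do not
panic in this model, while a single scan that finds three does.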
> > > --- a/kernel/hung_task.c
> > > +++ b/kernel/hung_task.c
> > > @@ -229,9 +232,11 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
> > > */
> > > sysctl_hung_task_detect_count++;
> > > + total_hung_task = sysctl_hung_task_detect_count - prev_detect_count;
> > > trace_sched_process_hang(t);
> > > - if (sysctl_hung_task_panic) {
> > > + if (sysctl_hung_task_panic &&
> > > + (total_hung_task >= sysctl_hung_task_panic)) {
> > > console_verbose();
> > > hung_task_show_lock = true;
> > > hung_task_call_panic = true;
>
> I would expect that this patch added another counter, similar to
> sysctl_hung_task_detect_count. It would be incremented only once per check
> when a hung task was detected. And it would be cleared (reset) when no
> hung task was found.
>
> Best Regards,
> Petr
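
For comparison, the alternative counter Petr describes could be
sketched roughly as below. This is again a userspace model rather than
kernel code; consecutive_hung_scans and scan_would_panic_persistent()
are hypothetical names, and the sketch assumes the threshold would then
count consecutive scans with a hung task rather than hung tasks within
one scan.

```c
#include <stdbool.h>

/* Hypothetical counter: consecutive scans that found a hung task. */
static unsigned int consecutive_hung_scans;

/*
 * Model one scan under the suggested scheme.
 * hung_found:      whether this scan detected at least one hung task
 * panic_threshold: number of consecutive hung scans before panic,
 *                  0 disables the panic
 * Returns true if this scan would request a panic.
 */
static bool scan_would_panic_persistent(bool hung_found,
					unsigned int panic_threshold)
{
	if (hung_found)
		consecutive_hung_scans++;	/* at most once per scan */
	else
		consecutive_hung_scans = 0;	/* stall cleared, start over */

	return panic_threshold && consecutive_hung_scans >= panic_threshold;
}
```

In this model a clean scan resets the count, so only a stall that
persists across the configured number of scans would trigger the panic.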