[PATCH] kthread: kthread_bind fails to enforce CPU affinity (fixes kernel BUG at kernel/smpboot.c:134!)

Ingo Molnar mingo at kernel.org
Mon Dec 8 19:34:08 AEDT 2014


* Anton Blanchard <anton at samba.org> wrote:

> I have a busy ppc64le KVM box where guests sometimes hit the 
> infamous "kernel BUG at kernel/smpboot.c:134!" issue during 
> boot:
> 
> BUG_ON(td->cpu != smp_processor_id());
> 
> Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops
> output confirms it:
> 
> CPU: 0
> Comm: watchdog/130
> 
> The issue is in kthread_bind where we set the cpus_allowed 
> mask, but do not touch task_thread_info(p)->cpu. The scheduler 
> assumes the previously scheduled CPU is in the cpus_allowed 
> mask, but in this case we are moving a thread to another CPU so 
> it is not.
> 
> We used to call set_task_cpu which sets 
> task_thread_info(p)->cpu (in fact kthread_bind still has a 
> comment suggesting this). That was removed in e2912009fb7b 
> ("sched: Ensure set_task_cpu() is never called on blocked 
> tasks").
> 
> Since we cannot call set_task_cpu (the task is in a sleeping 
> state), just do an explicit set of task_thread_info(p)->cpu.

So we cannot call set_task_cpu() because in the normal lifetime 
of a task the ->cpu value gets set on wakeup. So if a task is 
blocked right now, and its affinity changes, it ought to get a 
correct ->cpu selected on wakeup. The affinity mask and the 
current value of ->cpu getting out of sync is thus 'normal'.
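
To make that concrete, here is a toy userspace model of the idea. 
It is not kernel code: toy_task, toy_set_affinity() and toy_wake() 
are made-up names, and the "pick the lowest allowed CPU" policy is 
an assumption chosen only to keep the sketch short.

```c
/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

struct toy_task {
	int cpu;          /* last CPU the task ran on (stands in for ->cpu) */
	uint32_t allowed; /* bitmask of permitted CPUs (stands in for cpus_allowed) */
};

/* Affinity change on a blocked task: only the mask changes, ->cpu goes stale. */
static void toy_set_affinity(struct toy_task *t, int cpu)
{
	t->allowed = UINT32_C(1) << cpu;
}

/*
 * Wakeup: keep the previous CPU if it is still allowed, otherwise
 * pick the lowest allowed CPU. Returns the CPU the task ends up on.
 */
static int toy_wake(struct toy_task *t)
{
	if (!((t->allowed >> t->cpu) & 1)) {
		for (int c = 0; c < TOY_NR_CPUS; c++) {
			if ((t->allowed >> c) & 1) {
				t->cpu = c;
				break;
			}
		}
	}
	return t->cpu;
}
```

In this model a task with cpu == 0 that gets bound to CPU 5 still 
reports cpu == 0 until toy_wake() runs, and only then moves to 5. 
That window is the 'normal' out-of-sync state described above.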

(Check for example how set_cpus_allowed_ptr() works: we first set 
the new allowed mask, then we migrate the task away if 
necessary.)
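
A rough sketch of that ordering, again as a toy userspace model and 
not the real function: toy_task, toy_first_allowed() and 
toy_set_allowed() are hypothetical names, and the single 'queued' 
flag is a simplifying assumption standing in for "runnable or 
running" versus "blocked".

```c
/* Toy userspace model, NOT kernel code: every name here is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_task {
	int cpu;          /* CPU the task last ran on */
	uint32_t allowed; /* bitmask of permitted CPUs */
	bool queued;      /* true: runnable or running, false: blocked */
};

/* Lowest CPU set in the mask, or -1 if the mask is empty. */
static int toy_first_allowed(uint32_t mask)
{
	for (int c = 0; c < 32; c++)
		if ((mask >> c) & 1)
			return c;
	return -1;
}

/* Returns true if the task had to be migrated right away. */
static bool toy_set_allowed(struct toy_task *t, uint32_t new_mask)
{
	if (new_mask == 0)
		return false;           /* reject an empty mask, change nothing */

	t->allowed = new_mask;          /* step 1: install the new mask */

	if ((new_mask >> t->cpu) & 1)
		return false;           /* current CPU is still permitted */
	if (!t->queued)
		return false;           /* blocked: wakeup picks a CPU later */

	t->cpu = toy_first_allowed(new_mask); /* step 2: migrate away */
	return true;
}
```

The point of the sketch is the blocked case: the mask is updated but 
cpu is left alone, on the expectation that the wakeup path fixes it.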

In the kthread_bind() case this is explicitly assumed: it only 
calls do_set_cpus_allowed().

But obviously the bug triggers in kernel/smpboot.c, and that 
assert shows a real bug - and your patch makes the assert go 
away, so the question is, how did the kthread get woken up and 
put on a runqueue without its ->cpu getting set?

One possibility is a generic scheduler bug in ttwu(), resulting 
in ->cpu not getting set properly. If this was the case then 
other places would be blowing up as well, and I don't think we 
are seeing this currently, especially not over such a long 
timespan.

Another possibility would be that kthread_bind()'s assumption 
that the task is inactive is false: if the task activates when we 
think it's blocked and we just hotplug-migrate it away while it's 
running (setting its td->cpu?), the assert could trigger, I think, 
and the patch would make the assert go away.

A third possibility would be, if this is a freshly created 
thread, some sort of initialization race - either in the kthread 
or in the scheduler code.

Weird.

Thanks,

	Ingo


More information about the Linuxppc-dev mailing list