[PATCH 1/2] sched: Feature to decide if steal should update CPU capacity

Srikar Dronamraju srikar at linux.ibm.com
Tue Oct 28 22:42:07 AEDT 2025


* Peter Zijlstra <peterz at infradead.org> [2025-10-28 12:18:13]:

> On Tue, Oct 28, 2025 at 04:12:54PM +0530, Srikar Dronamraju wrote:
> > At present, the scheduler scales CPU capacity for fair tasks based on
> > time spent on irq and steal time. If a CPU sees irq or steal time, its
> > capacity for fair tasks decreases, causing tasks to migrate to other CPUs
> > that are not affected by irq and steal time. All of this is gated by
> > NONTASK_CAPACITY.
> > 
> > In virtualized setups, a CPU that reports steal time (time taken by the
> > hypervisor) can cause tasks to migrate unnecessarily to sibling CPUs that
> > appear to be less busy, only for the situation to reverse shortly.
> > 
> > To mitigate this ping-pong behaviour, this change introduces a new
> > scheduler feature flag, ACCT_STEAL, which controls whether steal time
> > contributes to non-task capacity adjustments (used for fair scheduling).
> 
> Please don't use sched_feat like this. If this is something that wants
> to be set by architectures move it to a normal static_branch (like eg.
> sched_energy_present, sched_asym_cpucapacity, sched_cluster_active,
> sched_smt_present, sched_numa_balancing etc.).

Ok, Peter, will move it to a static_branch approach and post a v2.
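
For illustration only (this is not the v2 patch), the idea can be modelled
in userspace roughly as below. The names sched_acct_steal and
fair_capacity() are hypothetical, and a plain bool stands in for the static
key; in the kernel the flag would presumably be declared with
DEFINE_STATIC_KEY_FALSE() and tested with static_branch_unlikely(). The
arithmetic is a simplified version of how non-task time reduces the capacity
left for fair tasks, assuming irq and steal are already expressed on the
0..1024 capacity scale.

```c
#include <assert.h>
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024UL

/*
 * Hypothetical stand-in for a static key. An architecture would flip
 * this once at boot rather than through a sched_feat debug knob.
 */
static bool sched_acct_steal = true;

/*
 * Simplified model: capacity left for fair tasks after non-task time
 * (irq, and optionally steal) is subtracted. Both inputs are assumed
 * to be averages on the 0..SCHED_CAPACITY_SCALE range.
 */
static unsigned long fair_capacity(unsigned long irq, unsigned long steal)
{
	unsigned long max = SCHED_CAPACITY_SCALE;
	unsigned long nontask = irq;

	/* Only count steal time when the flag says so. */
	if (sched_acct_steal)
		nontask += steal;

	/* Fully consumed CPU: report a minimal, non-zero capacity. */
	if (nontask >= max)
		return 1;

	return max * (max - nontask) / max;
}

/* Small self-check showing the effect of the flag. */
static void acct_steal_demo(void)
{
	sched_acct_steal = true;
	assert(fair_capacity(100, 200) == 724);

	/* With steal ignored, only irq time lowers the capacity. */
	sched_acct_steal = false;
	assert(fair_capacity(100, 200) == 924);
}
```

With the flag cleared, a CPU that reports steal time keeps the same fair
capacity as its siblings, so the load balancer has no capacity-based reason
to move tasks away from it and back again.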

-- 
Thanks and Regards
Srikar Dronamraju

