[PATCH 1/2] sched: Feature to decide if steal should update CPU capacity

Peter Zijlstra peterz at infradead.org
Tue Oct 28 22:18:13 AEDT 2025


On Tue, Oct 28, 2025 at 04:12:54PM +0530, Srikar Dronamraju wrote:
> At present, the scheduler scales CPU capacity for fair tasks based on time
> spent on irq and steal time. If a CPU sees irq or steal time, its
> capacity for fair tasks decreases, causing tasks to migrate to other CPUs
> that are not affected by irq and steal time. All of this is gated by
> NONTASK_CAPACITY.
> 
> In virtualized setups, a CPU that reports steal time (time taken by the
> hypervisor) can cause tasks to migrate unnecessarily to sibling CPUs that
> appear to be less busy, only for the situation to reverse shortly.
> 
> To mitigate this ping-pong behaviour, this change introduces a new
> scheduler feature flag: ACCT_STEAL which will control whether steal time
> contributes to non-task capacity adjustments (used for fair scheduling).

Please don't use sched_feat like this. If this is something that wants
to be set by architectures, move it to a normal static_branch (like e.g.
sched_energy_present, sched_asym_cpucapacity, sched_cluster_active,
sched_smt_present, sched_numa_balancing etc.).
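
Something along these lines, as a rough userspace sketch only. The
static-key macros below are stubbed with a plain bool so this compiles
outside the tree; in the kernel they come from <linux/jump_label.h>. The
key name sched_acct_steal, the arch_sched_set_acct_steal() hook, the
nontask_delta() helper and the default-on choice are all assumptions made
for illustration, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace stand-ins for the kernel's jump-label API. The real macros
 * patch the branch at runtime; here a plain bool models the same logic.
 */
struct static_key_true { bool enabled; };
#define DEFINE_STATIC_KEY_TRUE(name)  struct static_key_true name = { true }
#define static_branch_likely(key)     ((key)->enabled)
#define static_branch_enable(key)     ((key)->enabled = true)
#define static_branch_disable(key)    ((key)->enabled = false)

/* Hypothetical key; default-on is assumed, so steal keeps being accounted. */
DEFINE_STATIC_KEY_TRUE(sched_acct_steal);

/* Hypothetical hook an architecture could call during boot. */
void arch_sched_set_acct_steal(bool on)
{
	if (on)
		static_branch_enable(&sched_acct_steal);
	else
		static_branch_disable(&sched_acct_steal);
}

/*
 * Simplified model of the non-task time charged against fair capacity:
 * irq time always counts, steal time only while the key is enabled.
 */
uint64_t nontask_delta(uint64_t irq_delta, uint64_t steal_delta)
{
	uint64_t delta = irq_delta;

	if (static_branch_likely(&sched_acct_steal))
		delta += steal_delta;

	return delta;
}
```

With this shape the decision is made once by the architecture instead of
being exposed as a debug knob, and the hot path only sees a patched branch.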

