[PATCH 1/2] sched: Feature to decide if steal should update CPU capacity
Peter Zijlstra
peterz at infradead.org
Tue Oct 28 22:18:13 AEDT 2025
On Tue, Oct 28, 2025 at 04:12:54PM +0530, Srikar Dronamraju wrote:
> At present, the scheduler scales CPU capacity for fair tasks based on
> time spent on irq and steal time. If a CPU sees irq or steal time, its
> capacity for fair tasks decreases, causing tasks to migrate to other CPUs
> that are not affected by irq and steal time. All of this is gated by
> NONTASK_CAPACITY.
>
> In virtualized setups, a CPU that reports steal time (time taken by the
> hypervisor) can cause tasks to migrate unnecessarily to sibling CPUs that
> appear to be less busy, only for the situation to reverse shortly.
>
> To mitigate this ping-pong behaviour, this change introduces a new
> scheduler feature flag, ACCT_STEAL, which controls whether steal time
> contributes to non-task capacity adjustments (used for fair scheduling).

Please don't use sched_feat like this. If this is something that wants
to be set by architectures, move it to a normal static_branch (like, e.g.,
sched_energy_present, sched_asym_cpucapacity, sched_cluster_active,
sched_smt_present, sched_numa_balancing, etc.).
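
As a rough illustration of the shape such a switch could take, here is a
userspace sketch, not kernel code and not taken from the patch. A plain
bool stands in for the static key; in the kernel it would be declared with
DEFINE_STATIC_KEY_TRUE() and tested with static_branch_likely() from
<linux/jump_label.h>. The names sched_acct_steal,
arch_sched_disable_acct_steal(), nontask_delta() and scale_capacity() are
hypothetical. The sketch also treats irq and steal as already-scaled
utilisation values on a 0..max scale, whereas the kernel tracks them as
time deltas fed through a load average.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a static key. Default true keeps today's
 * behaviour, where steal time counts as non-task time.
 */
static bool sched_acct_steal = true;

/* Hypothetical hook an architecture could call once during setup. */
static void arch_sched_disable_acct_steal(void)
{
	sched_acct_steal = false;
}

/* Non-task share charged against fair capacity for one update. */
static uint64_t nontask_delta(uint64_t irq, uint64_t steal)
{
	if (sched_acct_steal)
		return irq + steal;
	return irq;
}

/*
 * Shrink the free capacity in proportion to the non-task share.
 * Loosely modelled on how the fair class scales capacity by irq
 * utilisation; a fully consumed CPU keeps a minimal capacity of 1.
 */
static uint64_t scale_capacity(uint64_t free, uint64_t nontask, uint64_t max)
{
	assert(max > 0);
	if (nontask >= max)
		return 1;
	return free * (max - nontask) / max;
}
```

With max = 1024, irq = 100 and steal = 156, the sketch gives a fair
capacity of 768 while steal is accounted and 924 once an architecture has
called arch_sched_disable_acct_steal(). The point of a static key over a
sched_feat is that the decision is made by architecture code at setup
rather than toggled through the debug interface.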