[RFC 1/1] sched/fair: Consider asymmetric scheduler groups in load balancer

Tobias Huschle huschle at linux.ibm.com
Fri Jul 7 17:44:55 AEST 2023


On 2023-07-04 15:40, Peter Zijlstra wrote:
> On Mon, May 15, 2023 at 01:46:01PM +0200, Tobias Huschle wrote:
>> The current load balancer implementation implies that scheduler 
>> groups,
>> within the same domain, all host the same number of CPUs. This is
>> reflected in the condition, that a scheduler group, which is load
>> balancing and classified as having spare capacity, should pull work
>> from the busiest group, if the local group runs less processes than
>> the busiest one. This implies that these two groups should run the
>> same number of processes, which is problematic if the groups are not
>> of the same size.
>> 
>> The assumption that scheduler groups within the same scheduler domain
>> host the same number of CPUs appears to be true for non-s390
>> architectures.
> 
> Mostly; there's historically the cpuset case where we can create
> lopsided groups like that. And today we're growing all these hybrid
> things that will also tickle this, except they're looking towards
> different balancer extentions to deal with the IPC difference so might
> not be immediately caring about this here issue.
> 
> 
>> Signed-off-by: Tobias Huschle <huschle at linux.ibm.com>
>> ---
>>  kernel/sched/fair.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 48b6f0ca13ac..b1307d7e4065 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -10426,7 +10426,8 @@ static struct sched_group 
>> *find_busiest_group(struct lb_env *env)
>>  	 * group's child domain.
>>  	 */
>>  	if (sds.prefer_sibling && local->group_type == group_has_spare &&
>> -	    busiest->sum_nr_running > local->sum_nr_running + 1)
>> +	    busiest->sum_nr_running * local->group_weight >
>> +			local->sum_nr_running * busiest->group_weight + 1)
> 
> Should that not be: busiest->group_weight * (local->sum_nr_running + 1) 
> ?

I agree, adding the brackets makes more sense and is clearer on what's
intended by this check while also preserving the original behavior for
local->group_weight == busiest->group_weight

> 
> I'm not opposed to this -- it seems fairly straight forward.

Appreciated, I will go ahead and send a patch once I incorporated the 
other feedback I got.
Thanks.

> 
>>  		goto force_balance;
>> 
>>  	if (busiest->group_type != group_overloaded) {
>> --
>> 2.34.1
>> 


More information about the Linuxppc-dev mailing list