[PATCH v3 5/6] sched/fair: Consider SMT in ASYM_PACKING load balance

Ricardo Neri ricardo.neri-calderon at linux.intel.com
Wed May 19 05:07:40 AEST 2021


On Fri, May 14, 2021 at 07:14:15PM -0700, Ricardo Neri wrote:
> On Fri, May 14, 2021 at 11:47:45AM +0200, Peter Zijlstra wrote:
> > On Thu, May 13, 2021 at 08:49:08AM -0700, Ricardo Neri wrote:
> > >  include/linux/sched/topology.h |   1 +
> > >  kernel/sched/fair.c            | 101 +++++++++++++++++++++++++++++++++
> > >  2 files changed, 102 insertions(+)
> > > 
> > > diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
> > > index 8f0f778b7c91..43bdb8b1e1df 100644
> > > --- a/include/linux/sched/topology.h
> > > +++ b/include/linux/sched/topology.h
> > > @@ -57,6 +57,7 @@ static inline int cpu_numa_flags(void)
> > >  #endif
> > >  
> > >  extern int arch_asym_cpu_priority(int cpu);
> > > +extern bool arch_asym_check_smt_siblings(void);
> > >  
> > >  struct sched_domain_attr {
> > >  	int relax_domain_level;
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index c8b66a5d593e..3d6cc027e6e6 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -106,6 +106,15 @@ int __weak arch_asym_cpu_priority(int cpu)
> > >  	return -cpu;
> > >  }
> > >  
> > > +/*
> > > + * For asym packing, first check the state of SMT siblings before deciding to
> > > + * pull tasks.
> > > + */
> > > +bool __weak arch_asym_check_smt_siblings(void)
> > > +{
> > > +	return false;
> > > +}
> > > +
> > >  /*
> > >   * The margin used when comparing utilization with CPU capacity.
> > >   *
> > 
> > > @@ -8458,6 +8550,9 @@ sched_asym(struct lb_env *env, struct sd_lb_stats *sds,  struct sg_lb_stats *sgs
> > >  	if (group == sds->local)
> > >  		return false;
> > >  
> > > +	if (arch_asym_check_smt_siblings())
> > > +		return asym_can_pull_tasks(env->dst_cpu, sds, sgs, group);
> > > +
> > >  	return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu);
> > >  }
> > 
> > So I'm thinking that this is a property of having ASYM_PACKING at a core
> > level, rather than some arch special. Wouldn't something like this be
> > more appropriate?
> > 
> > ---
> > --- a/include/linux/sched/topology.h
> > +++ b/include/linux/sched/topology.h
> > @@ -57,7 +57,6 @@ static inline int cpu_numa_flags(void)
> >  #endif
> >  
> >  extern int arch_asym_cpu_priority(int cpu);
> > -extern bool arch_asym_check_smt_siblings(void);
> >  
> >  struct sched_domain_attr {
> >  	int relax_domain_level;
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -107,15 +107,6 @@ int __weak arch_asym_cpu_priority(int cp
> >  }
> >  
> >  /*
> > - * For asym packing, first check the state of SMT siblings before deciding to
> > - * pull tasks.
> > - */
> > -bool __weak arch_asym_check_smt_siblings(void)
> > -{
> > -	return false;
> > -}
> > -
> > -/*
> >   * The margin used when comparing utilization with CPU capacity.
> >   *
> >   * (default: ~20%)
> > @@ -8550,7 +8541,8 @@ sched_asym(struct lb_env *env, struct sd
> >  	if (group == sds->local)
> >  		return false;
> >  
> > -	if (arch_asym_check_smt_siblings())
> > +	if ((sds->local->flags & SD_SHARE_CPUCAPACITY) ||
> > +	    (group->flags & SD_SHARE_CPUCAPACITY))
> >  		return asym_can_pull_tasks(env->dst_cpu, sds, sgs, group);
> 
> Thanks Peter for the quick review! This makes sense to me. The only
> reason we proposed arch_asym_check_smt_siblings() is because we were
> about breaking powerpc (I need to study how they set priorities for SMT,
> if applicable). If you think this is not an issue I can post a
> v4 with this update.

As far as I can see, priorities in powerpc are set by the CPU number.
However, I am not sure how CPUs are enumerated? If CPUs in brackets are
SMT sibling, Does an enumeration looks like A) [0, 1], [2, 3] or B) [0, 2],
[1, 3]? I guess B is the right answer. Otherwise, both SMT siblings of a
core would need to be busy before a new core is used.

Still, I think the issue described in the cover letter may be
reproducible in powerpc as well. If CPU3 is offlined, and [0, 2] pulled
tasks from [1, -] so that both CPU0 and CPU2 become busy, CPU1 would not be
able to help since CPU0 has the highest priority.

I am cc'ing the linuxppc list to get some feedback.

Thanks and BR,
Ricardo


More information about the Linuxppc-dev mailing list