[PATCHv4 2/2] powerpc: implement arch_scale_smt_power for Power7
Michael Neuling
mikey at neuling.org
Wed Feb 24 22:58:26 EST 2010
In message <11927.1267010024 at neuling.org> you wrote:
> > > If there's less the group will normally be balanced and we fall out and
> > > end up in check_asym_packing().
> > >
> > > So what I tried doing with that loop is detect if there's a hole in the
> > > packing before busiest. Now that I think about it, what we need to check
> > > is if this_cpu (the removed cpu argument) is idle and less than busiest.
> > >
> > > So something like:
> > >
> > > static int check_asym_packing(struct sched_domain *sd,
> > > 			      struct sd_lb_stats *sds,
> > > 			      int this_cpu, unsigned long *imbalance)
> > > {
> > > 	int busiest_cpu;
> > >
> > > 	if (!(sd->flags & SD_ASYM_PACKING))
> > > 		return 0;
> > >
> > > 	if (!sds->busiest)
> > > 		return 0;
> > >
> > > 	busiest_cpu = group_first_cpu(sds->busiest);
> > > 	if (cpu_rq(this_cpu)->nr_running || this_cpu > busiest_cpu)
> > > 		return 0;
> > >
> > > 	*imbalance = (sds->max_load * sds->busiest->cpu_power) /
> > > 			SCHED_LOAD_SCALE;
> > > 	return 1;
> > > }
> > >
> > > Does that make sense?
> >
> > I think so.
> >
> > I'm seeing check_asym_packing do the right thing with the simple SMT2
> > with 1 process case. It marks cpu0 as imbalanced when cpu0 is idle and
> > cpu1 is busy.
> >
> > Unfortunately the process doesn't seem to get migrated down though.
> > Do we need to give *imbalance a higher value?
>
> So with ego's help, I traced this down a bit more.
>
> In my simple test case (SMT2, t0 idle, t1 active) if f_b_g() hits our
> new case in check_asym_packing(), load_balance then runs f_b_q().
> f_b_q() has this:
>
> 	if (capacity && rq->nr_running == 1 && wl > imbalance)
> 		continue;
>
> when check_asym_packing() hits, wl = 1783 and imbalance = 1024, so we
> continue and busiest remains NULL.
>
> load_balance then does "goto out_balanced" and it doesn't attempt to
> move the task.
>
> Based on this and on ego's suggestion I pulled in Suresh Siddha's patch
> from: http://lkml.org/lkml/2010/2/12/352. This fixes the problem. The
> process is moved down to t0.
>
> I've only tested SMT2 so far.
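
To make the trace quoted above easier to follow, here is a minimal standalone sketch of the two checks involved. It is not kernel code: the helper names asym_wants_pull() and fbq_skips_rq() are made up for illustration, and the real logic lives in check_asym_packing() and f_b_q(). The capacity value of 1 used in the note below is an assumption; the thread only implies that it was non-zero.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model, not kernel code.
 *
 * Mirrors the test in the proposed check_asym_packing(): report an
 * imbalance only when this_cpu is idle and is not numbered higher
 * than the first cpu of the busiest group.
 */
static bool asym_wants_pull(unsigned int this_cpu_nr_running,
			    int this_cpu, int busiest_cpu)
{
	if (this_cpu_nr_running || this_cpu > busiest_cpu)
		return false;
	return true;
}

/*
 * Mirrors the f_b_q() condition quoted above: a runqueue holding a
 * single task whose weighted load exceeds the requested imbalance is
 * skipped, so it can never be picked as busiest.
 */
static bool fbq_skips_rq(unsigned long capacity, unsigned int nr_running,
			 unsigned long wl, unsigned long imbalance)
{
	return capacity && nr_running == 1 && wl > imbalance;
}
```

With the values from the trace (t0 idle, t1 running one task, wl = 1783, imbalance = 1024), asym_wants_pull(0, 0, 1) is true but fbq_skips_rq(1, 1, 1783, 1024) is also true, which matches the observed behaviour: the imbalance is flagged, but the only candidate runqueue is skipped and load_balance ends up at out_balanced.
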

SMT4 also works in the simple test case of a single process being pulled
down to thread 0.

As you suspected though, unfortunately this is only working with
CONFIG_NO_HZ off. If I turn NO_HZ on, my single process gets bounced
around the core.

Did you think of any ideas for how to fix the NO_HZ interaction?

Mikey