[PATCH v4 09/10] Powerpc/smp: Create coregroup domain

Srikar Dronamraju srikar at linux.vnet.ibm.com
Wed Jul 29 16:13:55 AEST 2020


* Valentin Schneider <valentin.schneider at arm.com> [2020-07-28 16:03:11]:

Hi Valentin,

Thanks for looking into the patches.

> On 27/07/20 06:32, Srikar Dronamraju wrote:
> > Add percpu coregroup maps and masks to create coregroup domain.
> > If a coregroup doesn't exist, the coregroup domain will be degenerated
> > in favour of SMT/CACHE domain.
> >
> 
> So there's at least one arm64 platform out there with the same "pairs of
> cores share L2" thing (Ampere eMAG), and that lives quite happily with the
> default scheduler topology (SMT/MC/DIE). Each pair of core gets its MC
> domain, and the whole system is covered by DIE.
> 
> Now arguably it's not a perfect representation; DIE doesn't have
> SD_SHARE_PKG_RESOURCES so the highest level sd_llc can point to is MC. That
> will impact all callsites using cpus_share_cache(): in the eMAG case, only
> pairs of cores will be seen as sharing cache, even though *all* cores share
> the same L3.
> 

Okay, Its good to know that we have a chip which is similar to P9 in
topology.

> I'm trying to paint a picture of what the P9 topology looks like (the one
> you showcase in your cover letter) to see if there are any similarities;
> from what I gather in [1], wikichips and your cover letter, with P9 you can
> have something like this in a single DIE (somewhat unsure about L3 setup;
> it looks to be distributed?)
> 
>      +---------------------------------------------------------------------+
>      |                                  L3                                 |
>      +---------------+-+---------------+-+---------------+-+---------------+
>      |       L2      | |       L2      | |       L2      | |       L2      |
>      +------+-+------+ +------+-+------+ +------+-+------+ +------+-+------+
>      |  L1  | |  L1  | |  L1  | |  L1  | |  L1  | |  L1  | |  L1  | |  L1  |
>      +------+ +------+ +------+ +------+ +------+ +------+ +------+ +------+
>      |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs| |4 CPUs|
>      +------+ +------+ +------+ +------+ +------+ +------+ +------+ +------+
> 
> Which would lead to (ignoring the whole SMT CPU numbering shenanigans)
> 
> NUMA     [                                                   ...
> DIE      [                                             ]
> MC       [         ] [         ] [         ] [         ]
> BIGCORE  [         ] [         ] [         ] [         ]
> SMT      [   ] [   ] [   ] [   ] [   ] [   ] [   ] [   ]
>          00-03 04-07 08-11 12-15 16-19 20-23 24-27 28-31  <other node here>
> 

What you have summed up is perfectly what a P9 topology looks like. I dont
think I could have explained it better than this.

> This however has MC == BIGCORE; what makes it you can have different spans
> for these two domains? If it's not too much to ask, I'd love to have a P9
> topology diagram.
> 
> [1]: 20200722081822.GG9290 at linux.vnet.ibm.com

At this time the current topology would be good enough i.e BIGCORE would
always be equal to a MC. However in future we could have chips that can have
lesser/larger number of CPUs in llc than in a BIGCORE or we could have
granular or split L3 caches within a DIE. In such a case BIGCORE != MC.

Also in the current P9 itself, two neighbouring core-pairs form a quad.
Cache latency within a quad is better than a latency to a distant core-pair.
Cache latency within a core pair is way better than latency within a quad.
So if we have only 4 threads running on a DIE all of them accessing the same
cache-lines, then we could probably benefit if all the tasks were to run
within the quad aka MC/Coregroup.

I have found some benchmarks which are latency sensitive to benefit by
having a grouping a quad level (using kernel hacks and not backed by
firmware changes). Gautham also found similar results in his experiments
but he only used binding within the stock kernel.

I am not setting SD_SHARE_PKG_RESOURCES in MC/Coregroup sd_flags as in MC
domain need not be LLC domain for Power.

-- 
Thanks and Regards
Srikar Dronamraju


More information about the Linuxppc-dev mailing list