[PATCH v5 3/4] arm64: topology: Support SMT control on ACPI based system
Yicong Yang
yangyicong at huawei.com
Thu Aug 29 17:40:54 AEST 2024
Hi Pierre,
On 2024/8/27 23:40, Pierre Gondois wrote:
> Hello Yicong,
> IIUC we are looking for the maximum number of threads a CPU can have
> in the platform. But is is actually possible to have a platform with
> CPUs not having the same number of threads ?
>
I was thinking of the heterogenous platforms, for example part of the
cores have SMT and others don't (I don't have any though, but there
should be some such platforms for arm64).
> For instance kernel/cpu.c::cpu_smt_num_threads_valid() will check
> that the number of threads is either 1 or cpu_smt_max_threads (as
> CONFIG_SMT_NUM_THREADS_DYNAMIC is not enabled for arm64). However
> a (hypothetical) platform with:
> - CPU0: 2 threads
> - CPU1: 4 threads
> should not be able to set the number of threads to 2, as
> 1 < 2 < cpu_smt_max_threads (cf. cpu_smt_num_threads_valid()).
>
It's a bit more complex. If we enable/disable the SMT using on/off control
then we won't have this problem. We'll online all the CPUs on enabling and
offline CPUs which is not primary thread and don't care about the thread
number of each core.
Control using thread number is introduced by CONFIG_SMT_NUM_THREADS_DYNAMIC
and only enabled on powerpc. I think this interface is not enough to handle
the hypothetical we assumed, since it assumes the homogenous platform that
all the CPUs have the same thread number. But this should be fine for
the platforms with SMT(with same thread number) and non-SMT cores, since it do
indicates the real thread number of the SMT cores.
> So if there is an assumption that all the CPUs have the same number of
> threads, it should be sufficient to count the number of threads for one
> CPU right ?
>
Naturally and conveniently I may think use of the threads number of CPU0 (boot
cpu) in such a solution. But this will be wrong if CPU0 is non-SMT on a heterogenous
platform :(
> In the other (unlikely) case (i.e. CPUs can have various number of threads),
> I think we should either:
> - use your method and check that all the CPUs have the same number of threads
> - get the number of threads for one CPU and check that all the CPUs are
> identical using the ACPI_PPTT_ACPI_IDENTICAL tag
I think this won't be simpler since we still need to iterate all the CPUs to see
if they have the same hetero_id (checked how we're using this ACPI tag in
arm_acpi_register_pmu_device()). We could already know if they're identical by
comparing the thread number and do the update if necessary.
> - have a per-cpu cpu_smt_max_threads value ?
This should be unnecessary in currently stage if there's no platforms
with several kind cores have different thread number like in your assumption
and enable CONFIG_SMT_NUM_THREADS_DYNAMIC on such platforms. Otherwise using
a global cpu_smt_max_threads to enable the SMT control should be enough.
Does it make sense?
Thanks,
Yicong
>
> Same comment for the DT patch. If there is an assumption that all CPUs have
> the same number of threads, then update_smt_num_threads() could only be called
> once I suppose,
>
> Regards,
> Pierre
>
>
> On 8/6/24 10:53, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong at hisilicon.com>
>>
>> For ACPI we'll build the topology from PPTT and we cannot directly
>> get the SMT number of each core. Instead using a temporary xarray
>> to record the SMT number of each core when building the topology
>> and we can know the largest SMT number in the system. Then we can
>> enable the support of SMT control.
>>
>> Signed-off-by: Yicong Yang <yangyicong at hisilicon.com>
>> ---
>> arch/arm64/kernel/topology.c | 24 ++++++++++++++++++++++++
>> 1 file changed, 24 insertions(+)
>>
>> diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
>> index 1a2c72f3e7f8..f72e1e55b05e 100644
>> --- a/arch/arm64/kernel/topology.c
>> +++ b/arch/arm64/kernel/topology.c
>> @@ -15,8 +15,10 @@
>> #include <linux/arch_topology.h>
>> #include <linux/cacheinfo.h>
>> #include <linux/cpufreq.h>
>> +#include <linux/cpu_smt.h>
>> #include <linux/init.h>
>> #include <linux/percpu.h>
>> +#include <linux/xarray.h>
>> #include <asm/cpu.h>
>> #include <asm/cputype.h>
>> @@ -43,11 +45,16 @@ static bool __init acpi_cpu_is_threaded(int cpu)
>> */
>> int __init parse_acpi_topology(void)
>> {
>> + int thread_num, max_smt_thread_num = 1;
>> + struct xarray core_threads;
>> int cpu, topology_id;
>> + void *entry;
>> if (acpi_disabled)
>> return 0;
>> + xa_init(&core_threads);
>> +
>> for_each_possible_cpu(cpu) {
>> topology_id = find_acpi_cpu_topology(cpu, 0);
>> if (topology_id < 0)
>> @@ -57,6 +64,20 @@ int __init parse_acpi_topology(void)
>> cpu_topology[cpu].thread_id = topology_id;
>> topology_id = find_acpi_cpu_topology(cpu, 1);
>> cpu_topology[cpu].core_id = topology_id;
>> +
>> + entry = xa_load(&core_threads, topology_id);
>> + if (!entry) {
>> + xa_store(&core_threads, topology_id,
>> + xa_mk_value(1), GFP_KERNEL);
>> + } else {
>> + thread_num = xa_to_value(entry);
>> + thread_num++;
>> + xa_store(&core_threads, topology_id,
>> + xa_mk_value(thread_num), GFP_KERNEL);
>> +
>> + if (thread_num > max_smt_thread_num)
>> + max_smt_thread_num = thread_num;
>> + }
>> } else {
>> cpu_topology[cpu].thread_id = -1;
>> cpu_topology[cpu].core_id = topology_id;
>> @@ -67,6 +88,9 @@ int __init parse_acpi_topology(void)
>> cpu_topology[cpu].package_id = topology_id;
>> }
>> + cpu_smt_set_num_threads(max_smt_thread_num, max_smt_thread_num);
>> +
>> + xa_destroy(&core_threads);
>> return 0;
>> }
>> #endif
>
> .
More information about the Linuxppc-dev
mailing list