[PATCH 11/11] powerpc/smp: Add a doorbell=off kernel parameter
Cédric Le Goater
clg at kaod.org
Fri Nov 12 03:01:15 AEDT 2021
On 11/11/21 11:41, Michael Ellerman wrote:
> Cédric Le Goater <clg at kaod.org> writes:
>> On processors with a XIVE interrupt controller (POWER9 and above), the
>> kernel can use either doorbells or XIVE to generate CPU IPIs. Sending
>> doorbell is generally preferred to using the XIVE IC because it is
>> faster. There are cases where we want to avoid doorbells and use XIVE
>> only, for debug or performance. Only useful on POWER9 and above.
>
> How much do we want this?
Yes. Thanks for asking. It is a recent need.
Here is some background I should have added in the first place. Maybe
for a v2.
We have different ways of doing IPIs on POWER9 and above processors,
depending on the platform and the underlying hypervisor.
- PowerNV uses global doorbells
- pSeries/KVM uses XIVE only because local doorbells are not
efficient, as they are emulated in the KVM hypervisor
- pSeries/PowerVM uses XIVE for remote cores and local doorbells for
threads on the same core (SMT4 or 8)
The recent commit 5b06d1679f2f ("powerpc/pseries: Use doorbells even
if XIVE is available") introduced the optimization for PowerVM, and
commit 107c55005fbd ("powerpc/pseries: Add KVM guest doorbell
restrictions") restricted it.
We would like a way to turn off the optimization.
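The per-platform choice described above can be summarized as a small decision function. This is only a user-space sketch, not kernel code: the enum names, `pick_ipi_method()`, and the `doorbell_off` flag are hypothetical names chosen for illustration, and the flag simply models the intent of the proposed doorbell=off parameter (use XIVE only).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum platform { PLAT_POWERNV, PLAT_PSERIES_KVM, PLAT_PSERIES_POWERVM };
enum ipi_method { IPI_GLOBAL_DOORBELL, IPI_LOCAL_DOORBELL, IPI_XIVE };

/*
 * Pick the IPI method for a target CPU.
 * same_core:    target is a thread on the sender's core.
 * doorbell_off: models the proposed doorbell=off switch (XIVE only).
 */
enum ipi_method pick_ipi_method(enum platform plat, bool same_core,
                                bool doorbell_off)
{
    if (doorbell_off)
        return IPI_XIVE;

    switch (plat) {
    case PLAT_POWERNV:
        /* Global doorbells can reach any thread. */
        return IPI_GLOBAL_DOORBELL;
    case PLAT_PSERIES_KVM:
        /* Local doorbells are emulated by KVM, so XIVE only. */
        return IPI_XIVE;
    case PLAT_PSERIES_POWERVM:
        /* Local doorbells within a core, XIVE for remote cores. */
        return same_core ? IPI_LOCAL_DOORBELL : IPI_XIVE;
    }
    return IPI_XIVE;
}
```

With `doorbell_off` set, a PowerVM guest in this model behaves like a KVM guest, which is the case the parameter is meant to cover.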
> Kernel command line args are a bit of a pain, they tend to be poorly
> tested, because someone has to explicitly enable them at boot time,
> and then reboot to test the other case.
True. The "xive=off" parameter was poorly tested initially.
> When would we want to enable this?
For bring-up, for debug, for tests. I have been using a similar switch
to compare the XIVE interrupt controller performance with doorbells on
POWER9 and POWER10.
A new need arises with PowerVM: some configurations will behave as KVM
(local doorbells are unsupported) and the doorbell=off parameter is a
simple way to handle this case today.
> Can we make the kernel smarter about when to use doorbells and make
> it automated?
I don't think we want to probe all IPI methods to detect how well
local doorbells are supported on the platform. Do we?
A machine property/feature would be cleaner. It is a global CPU
property but I don't know where to put it. Ideas?
> Could we make it a runtime switch?
We can. See the patch below. It covers the need for test/performance
but it won't work on a PowerVM system not supporting local doorbells
since boot will fail as soon as secondaries are started. We need a way
to take a decision early on which method to activate.
Thanks
C.
From dcac8528c89b689217515032f3329ba5ea10085d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg at kaod.org>
Date: Fri, 5 Nov 2021 12:23:48 +0100
Subject: [PATCH] powerpc/xive: Add a debugfs toggle to select xive for IPIs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
For performance tests only.
Signed-off-by: Cédric Le Goater <clg at kaod.org>
---
arch/powerpc/sysdev/xive/common.c | 26 ++++++++++++++++++++++++++
1 file changed, 26 insertions(+)
diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 39142df828a018..9ee36b95f9c545 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1826,6 +1826,30 @@ static int xive_eq_debug_show(struct seq_file *m, void *private)
}
DEFINE_SHOW_ATTRIBUTE(xive_eq_debug);
+static int xive_ipi_cause_debug_set(void *data, u64 val)
+{
+ static void (*do_ipi)(int cpu);
+
+ if (val) {
+ do_ipi = smp_ops->cause_ipi;
+ smp_ops->cause_ipi = xive_cause_ipi;
+ } else {
+ if (do_ipi)
+ smp_ops->cause_ipi = do_ipi;
+ }
+
+ return 0;
+}
+
+static int xive_ipi_cause_debug_get(void *data, u64 *val)
+{
+ *val = xive_cause_ipi == smp_ops->cause_ipi;
+ return 0;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(xive_ipi_cause_debug_fops, xive_ipi_cause_debug_get,
+ xive_ipi_cause_debug_set, "%llu\n");
+
static void xive_core_debugfs_create(void)
{
struct dentry *xive_dir;
@@ -1849,6 +1873,8 @@ static void xive_core_debugfs_create(void)
}
debugfs_create_bool("store-eoi", 0600, xive_dir, &xive_store_eoi);
debugfs_create_bool("save-restore", 0600, xive_dir, &xive_has_save_restore);
+ debugfs_create_file("ipi-cause", 0600, xive_dir,
+ NULL, &xive_ipi_cause_debug_fops);
}
#endif /* CONFIG_DEBUG_FS */
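
The save/restore logic of the toggle can be exercised outside the kernel with a small model. This is a sketch under stated assumptions, not the patch itself: `struct smp_ops_model`, the `*_model` handlers, and `ipi_cause_set()`/`ipi_cause_get()` are hypothetical stand-ins for `smp_ops`, the IPI handlers, and the debugfs callbacks. One difference from the patch as posted: writing a non-zero value twice there would record `xive_cause_ipi` itself as the handler to restore, so the model only saves the previous handler when XIVE is not already selected.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ipi_fn)(int cpu);

/* Hypothetical stand-in for the kernel's smp_ops. */
struct smp_ops_model {
    ipi_fn cause_ipi;
};

/* Dummy handlers; only their addresses matter in this model. */
void doorbell_cause_ipi_model(int cpu) { (void)cpu; }
void xive_cause_ipi_model(int cpu) { (void)cpu; }

/* Handler that was active before XIVE was forced. */
static ipi_fn saved_ipi;

void ipi_cause_set(struct smp_ops_model *ops, unsigned long long val)
{
    assert(ops != NULL);

    if (val) {
        /* Save only on the first switch, so a repeated write
         * does not overwrite the original handler. */
        if (ops->cause_ipi != xive_cause_ipi_model) {
            saved_ipi = ops->cause_ipi;
            ops->cause_ipi = xive_cause_ipi_model;
        }
    } else if (saved_ipi) {
        ops->cause_ipi = saved_ipi;
    }
}

unsigned long long ipi_cause_get(const struct smp_ops_model *ops)
{
    return ops->cause_ipi == xive_cause_ipi_model;
}
```

In this model, enabling twice and then disabling returns to the original doorbell handler. As noted above, a runtime toggle of this kind still does not help a system that cannot boot with local doorbells.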