[PATCH 11/11] powerpc/smp: Add a doorbell=off kernel parameter

Cédric Le Goater clg at kaod.org
Fri Nov 12 03:01:15 AEDT 2021


On 11/11/21 11:41, Michael Ellerman wrote:
> Cédric Le Goater <clg at kaod.org> writes:
>> On processors with a XIVE interrupt controller (POWER9 and above), the
>> kernel can use either doorbells or XIVE to generate CPU IPIs. Sending
>> doorbell is generally preferred to using the XIVE IC because it is
>> faster. There are cases where we want to avoid doorbells and use XIVE
>> only, for debug or performance. Only useful on POWER9 and above.
> 
> How much do we want this?

We do want it. Thanks for asking. It is a recent need.

Here is some background I should have added in the first place. Maybe
for a v2.

We have different ways of doing IPIs on POWER9 and above processors,
depending on the platform and the underlying hypervisor.

- PowerNV uses global doorbells

- pSeries/KVM uses XIVE only because local doorbells are not
   efficient, as they are emulated in the KVM hypervisor

- pSeries/PowerVM uses XIVE for remote cores and local doorbells for
   threads on the same core (SMT4 or SMT8)
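
As a rough illustration of the three cases above, here is a user-space
sketch of the selection. It is not kernel code: the enum and function
names are made up for the example, and the real choice is made when the
platform sets up its smp_ops.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum platform { PLAT_POWERNV, PLAT_PSERIES_KVM, PLAT_PSERIES_POWERVM };
enum ipi_method { IPI_DOORBELL, IPI_XIVE };

/*
 * Pick the IPI method for a target CPU.
 * same_core: true when the target is a sibling thread of the sender.
 */
static enum ipi_method pick_ipi(enum platform p, bool same_core)
{
	switch (p) {
	case PLAT_POWERNV:
		/* Global doorbells reach any thread. */
		return IPI_DOORBELL;
	case PLAT_PSERIES_KVM:
		/* Local doorbells are emulated by KVM, so avoid them. */
		return IPI_XIVE;
	case PLAT_PSERIES_POWERVM:
		/* Local doorbells for siblings, XIVE for remote cores. */
		return same_core ? IPI_DOORBELL : IPI_XIVE;
	}
	return IPI_XIVE;
}
```

For instance, pick_ipi(PLAT_PSERIES_POWERVM, false) selects XIVE because
the target is on a remote core.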

The recent commit 5b06d1679f2f ("powerpc/pseries: Use doorbells even
if XIVE is available") introduced the optimization for PowerVM and
commit 107c55005fbd ("powerpc/pseries: Add KVM guest doorbell
restrictions") restricted the optimization.

We would like a way to turn off the optimization.

> Kernel command line args are a bit of a pain, they tend to be poorly
> tested, because someone has to explicitly enable them at boot time,
> and then reboot to test the other case.

True. The "xive=off" parameter was poorly tested initially.

> When would we want to enable this?

For bring-up, for debug, for tests. I have been using a similar switch
to compare the XIVE interrupt controller performance with doorbells on
POWER9 and POWER10.

A new need arises with PowerVM: some configurations will behave like KVM
(local doorbells are unsupported), and the doorbell=off parameter is a
simple way to handle this case today.

> Can we make the kernel smarter about when to use doorbells and make
> it automated?

I don't think we want to probe all IPI methods to detect how well
local doorbells are supported on the platform. Do we?

A machine property/feature would be cleaner. It is a global CPU
property but I don't know where to put it. Ideas?

> Could we make it a runtime switch?

We can. See the patch below. It covers the test/performance need, but
it won't work on a PowerVM system that does not support local doorbells,
since boot will fail as soon as the secondaries are started. We need a
way to decide early which method to activate.
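
To make the "decide early" point concrete, here is a user-space model of
a boot parameter that is consulted before any secondary is started. It
is only a sketch: in the kernel the parsing would be hooked up through
something like early_param(), and the flag and function names below are
hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag; true means doorbells may be used for IPIs. */
static bool doorbell_enabled = true;

/* Scan a space-separated command line for the exact word "doorbell=off". */
static void parse_doorbell_param(const char *cmdline)
{
	static const char opt[] = "doorbell=off";
	const size_t len = sizeof(opt) - 1;
	const char *p = cmdline;

	while ((p = strstr(p, opt)) != NULL) {
		/* Only accept a match delimited by spaces or string ends. */
		bool starts = (p == cmdline) || (p[-1] == ' ');
		bool ends = (p[len] == '\0') || (p[len] == ' ');

		if (starts && ends) {
			doorbell_enabled = false;
			return;
		}
		p += len;
	}
}
```

The flag would then be read once, when the IPI method is chosen, so the
secondaries never see a doorbell on a system that cannot deliver one.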


Thanks

C.

 From dcac8528c89b689217515032f3329ba5ea10085d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg at kaod.org>
Date: Fri, 5 Nov 2021 12:23:48 +0100
Subject: [PATCH] powerpc/xive: Add a debugfs toggle to select xive for IPIs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

For performance tests only.

Signed-off-by: Cédric Le Goater <clg at kaod.org>
---
  arch/powerpc/sysdev/xive/common.c | 26 ++++++++++++++++++++++++++
  1 file changed, 26 insertions(+)

diff --git a/arch/powerpc/sysdev/xive/common.c b/arch/powerpc/sysdev/xive/common.c
index 39142df828a018..9ee36b95f9c545 100644
--- a/arch/powerpc/sysdev/xive/common.c
+++ b/arch/powerpc/sysdev/xive/common.c
@@ -1826,6 +1826,30 @@ static int xive_eq_debug_show(struct seq_file *m, void *private)
  }
  DEFINE_SHOW_ATTRIBUTE(xive_eq_debug);
  
+static int xive_ipi_cause_debug_set(void *data, u64 val)
+{
+	static void (*do_ipi)(int cpu);
+
+	if (val) {
+		if (smp_ops->cause_ipi != xive_cause_ipi)
+			do_ipi = smp_ops->cause_ipi;
+		smp_ops->cause_ipi = xive_cause_ipi;
+	} else if (do_ipi) {
+		smp_ops->cause_ipi = do_ipi;
+	}
+
+	return 0;
+}
+
+static int xive_ipi_cause_debug_get(void *data, u64 *val)
+{
+	*val = xive_cause_ipi == smp_ops->cause_ipi;
+	return 0;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(xive_ipi_cause_debug_fops, xive_ipi_cause_debug_get,
+			 xive_ipi_cause_debug_set, "%llu\n");
+
  static void xive_core_debugfs_create(void)
  {
  	struct dentry *xive_dir;
@@ -1849,6 +1873,8 @@ static void xive_core_debugfs_create(void)
  	}
  	debugfs_create_bool("store-eoi", 0600, xive_dir, &xive_store_eoi);
  	debugfs_create_bool("save-restore", 0600, xive_dir, &xive_has_save_restore);
+	debugfs_create_file("ipi-cause", 0600, xive_dir,
+			    NULL, &xive_ipi_cause_debug_fops);
  }
  
  #endif /* CONFIG_DEBUG_FS */

