[RFC PATCH] powerpc/powernv: Add winkle as cpuidle state

Stewart Smith stewart at linux.vnet.ibm.com
Fri Nov 27 14:06:36 AEDT 2015


This is a *very* RFC patch and one that is no doubt missing something
fairly important that I don't even know about (along with the things
that I know it's lacking).

i.e. don't merge this.

However, I'm posting it to discuss the possibility of enabling winkle
as a cpuidle state on powernv, as currently it's only used when we
hotplug out a CPU.

Using powertop with the following two patches:
https://lists.01.org/pipermail/powertop/2015-November/001852.html
https://lists.01.org/pipermail/powertop/2015-November/001854.html
also at: https://github.com/stewart-ibm/powertop/tree/power

I was able to easily measure the power consumption of an idle Ubuntu
15.04 system running 4.4.0-rc2+. With the addition of this patch, and
with SMT=off, I observed a solid 20W power saving on a dual socket 20
core system.

It seems that we have a fair few wakeups caused by IPI and dbs_timer.
There's a lot more of them with SMT=8 which means we spend a whole bunch
less time in winkle.

After letting the machine sit idle for a while, powertop was telling me
the leading causes of wakeup (SMT=off) were:
              25.0 µs/s      21.5        Interrupt      [16] IPI
              90.9 µs/s      17.5        kWork          dbs_timer
              28.5 µs/s       5.3        Interrupt      [3] net_rx(softirq)

Even with that though, with cores spending around 50% of their time in
winkle, 20W is a pretty solid power saving that may make us want to
reconsider the commonly held wisdom that the latency of coming out of
winkle isn't worth it over just fastsleep and nap.

With some investigation as to why we're waking up relatively often, we
could get closer to the ideal situation for this workload (well, absence
of workload) of pretty much completely powering off everything but a
single core.

Note that I have *NOT* run any benchmarks on this (and we obviously want
to get the residency and latency measurements correct, likely direct
from firmware... and get them right there too).
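
To illustrate why those two numbers matter, here is a minimal sketch
(not part of the patch, and not the real menu governor) of how a
governor could use target residency and exit latency to choose the
deepest usable state. The struct, the function and the table are
hypothetical. The only value taken from this patch is the winkle target
residency of 3000000; the other residencies and all exit latencies are
placeholder assumptions, since the real latencies come from firmware.

```c
#include <stddef.h>

/* Illustrative only; this is not the kernel's struct cpuidle_state. */
struct idle_state {
	const char *name;
	unsigned int exit_latency_us;     /* placeholder, not measured */
	unsigned int target_residency_us; /* minimum worthwhile idle time */
};

/*
 * Hypothetical table, ordered shallowest to deepest. Only the winkle
 * target residency (3000000 us) comes from the patch; everything else
 * is an assumption made for the sake of the example.
 */
static const struct idle_state example_states[] = {
	{ "Nap",           4,     100 },
	{ "FastSleep",    40,  300000 },
	{ "winkle",    10000, 3000000 },
};

/*
 * Return the index of the deepest state whose target residency fits in
 * the predicted idle time and whose exit latency respects the limit,
 * or -1 if no state qualifies.
 */
static int pick_idle_state(const struct idle_state *states, size_t n,
			   unsigned int predicted_idle_us,
			   unsigned int latency_limit_us)
{
	int chosen = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (states[i].target_residency_us > predicted_idle_us)
			continue;	/* would not stay idle long enough */
		if (states[i].exit_latency_us > latency_limit_us)
			continue;	/* would wake up too slowly */
		chosen = (int)i;
	}
	return chosen;
}
```

cpuidle takes target_residency in microseconds, so the 3000000 in this
patch asks for a predicted idle period of 3 seconds before winkle is
considered. With the placeholder table above, a predicted idle time of
400000 us would stop at FastSleep, while 5000000 us would reach winkle
as long as the latency limit allows it.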

Signed-off-by: Stewart Smith <stewart at linux.vnet.ibm.com>
---
 drivers/cpuidle/cpuidle-powernv.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
index 845bafcfa792..50726f11ac80 100644
--- a/drivers/cpuidle/cpuidle-powernv.c
+++ b/drivers/cpuidle/cpuidle-powernv.c
@@ -93,6 +93,14 @@ static int fastsleep_loop(struct cpuidle_device *dev,
 	return index;
 }
 #endif
+
+static int winkle_loop(struct cpuidle_device *dev,
+				struct cpuidle_driver *drv,
+				int index)
+{
+	power7_winkle();
+	return index;
+}
 /*
  * States for dedicated partition case.
  */
@@ -235,6 +243,13 @@ static int powernv_add_idle_states(void)
 			powernv_states[nr_idle_states].enter = &fastsleep_loop;
 		}
 #endif
+		if (flags[i] & OPAL_PM_WINKLE_ENABLED) {
+			strcpy(powernv_states[nr_idle_states].name, "winkle");
+			strcpy(powernv_states[nr_idle_states].desc, "winkle");
+			powernv_states[nr_idle_states].flags = CPUIDLE_FLAG_TIMER_STOP;
+			powernv_states[nr_idle_states].target_residency = 3000000;
+			powernv_states[nr_idle_states].enter = &winkle_loop;
+		}
 		powernv_states[nr_idle_states].exit_latency =
 				((unsigned int)latency_ns[i]) / 1000;
 
-- 
2.1.4


