[Skiboot] [PATCH] power9-dd1:slw: Modify the stop0_lite latency & residency.

Michael Neuling mikey at neuling.org
Thu Apr 20 16:00:55 AEST 2017


On Wed, 2017-04-19 at 15:25 +0530, Gautham R. Shenoy wrote:
> From: "Gautham R. Shenoy" <ego at linux.vnet.ibm.com>
> 
> Currently skiboot exposes the exit-latency for stop0_lite as 200ns and
> the target-residency to be 2us.
> 
> However, the kernel cpuidle infrastructure rounds the latency down to
> whole microseconds and lists the stop0_lite latency as 0us, putting it on
> par with snooze state. As a result, when the predicted latency is
> small (< 1us), cpuidle will select stop0_lite instead of snooze. The
> difference between these states is that snooze doesn't require an
> interrupt to exit from the state, but stop0_lite does. And the value
> 200ns doesn't include the interrupt latency.
> 
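A minimal sketch of the rounding effect described above, assuming the
conversion to microseconds is a plain integer division that drops the
sub-microsecond part. The helper name ns_to_us_truncated is hypothetical
and is not taken from the kernel or skiboot sources.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert nanoseconds to whole microseconds by
 * integer division. Anything below 1000ns collapses to 0us (assumed
 * behaviour, used here only to illustrate the problem).
 */
static inline uint32_t ns_to_us_truncated(uint64_t ns)
{
	return (uint32_t)(ns / 1000);
}
```

Under this assumption, ns_to_us_truncated(200) gives 0, which is
indistinguishable from a zero-latency state, while
ns_to_us_truncated(1000) gives 1.
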
> This shows up in the context_switch2 benchmark
> (http://ozlabs.org/~anton/junkcode/context_switch2.c) where the number
> of context switches per second with the stop0_lite disabled is found
> to be roughly 30% more than with stop0_lite enabled.
> 
> ===============================================================================
> x latency_200ns_residency_2us
> + latency_200ns_residency_2us_stop0_lite_disabled
>     N           Min           Max        Median           Avg        Stddev
> x 100        222784        473466        294510     302295.26       45380.6
> + 100        205316        609420        385198     396338.72     78135.648
> Difference at 99.0% confidence
> 	94043.5 +/- 23276.2
> 	31.1098% +/- 7.69983%
> 	(Student's t, pooled s = 63892.8)
> ===============================================================================
> 
> This can be correlated with the number of times cpuidle enters
> stop0_lite compared to snooze.
> ===================================================================
> latency = 200ns, residency = 2us
>    stop0_lite enabled
> 	* snooze usage      = 7
> 	* stop0_lite usage  = 3200324
> 	* stop1_lite usage  = 6
>    stop0_lite disabled
> 	* snooze usage      = 287846
> 	* stop0_lite usage  = 0
> 	* stop1_lite usage  = 0
> ===================================================================
> 
> Hence, bump up the exit latency of stop0_lite to 1us. Since the target
> residency is chosen to be 10 times the exit latency, set the target
> residency to 10us.
> 
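The 10x rule quoted above can be sketched as follows. The macro
RESIDENCY_FACTOR and the function residency_from_latency_ns are
hypothetical names used only for illustration; skiboot itself simply
lists both values in the idle-state table.

```c
#include <stdint.h>

/* From the patch description: target residency is 10 times the exit latency. */
#define RESIDENCY_FACTOR 10

/* Hypothetical helper: derive the target residency (ns) from the exit latency (ns). */
static inline uint32_t residency_from_latency_ns(uint32_t latency_ns)
{
	return latency_ns * RESIDENCY_FACTOR;
}
```

With the old latency of 200ns this gives 2000ns (2us); with the new
latency of 1000ns it gives 10000ns (10us), matching the values in the
diff.
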
> With these values, we see a roughly 69% improvement in the average
> number of context switches per second:
> =====================================================================
> x latency_200ns_residency_2us
> + latency_1us_residency_10us
>     N           Min           Max        Median           Avg        Stddev
> x 100        222784        473466        294510     302295.26       45380.6
> + 100        281790        710784        514878     510224.62     85163.252
> Difference at 99.0% confidence
> 	207929 +/- 24858.3
> 	68.7835% +/- 8.22319%
> 	(Student's t, pooled s = 68235.5)
> =====================================================================
> 
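As a cross-check of the relative difference reported in the table above,
here is a small sketch that recomputes it from the two quoted averages.
The function name percent_change is hypothetical; the inputs are the Avg
column values from the table.

```c
/*
 * Hypothetical helper: relative change of 'after' with respect to
 * 'before', expressed in percent.
 */
static inline double percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Calling percent_change(302295.26, 510224.62) gives a value of about
68.78, in line with the 68.7835% figure reported in the table.
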
> The cpuidle usage statistics show that we choose stop0_lite less often
> in such cases.

Do you have numbers on how much this changes power usage? I assume it gets worse
when we apply this.

Mikey

> 
> latency = 1us, residency = 10us
>    stop0_lite enabled
> 	* snooze usage      = 536808
> 	* stop0_lite usage  = 3
> 	* stop1_lite usage  = 7
> 
> Reported-by: Anton Blanchard <anton at samba.org>
> Signed-off-by: Gautham R. Shenoy <ego at linux.vnet.ibm.com>
> ---
>  hw/slw.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/slw.c b/hw/slw.c
> index b879173..bc8e967 100644
> --- a/hw/slw.c
> +++ b/hw/slw.c
> @@ -654,8 +654,8 @@ static struct cpu_idle_states power9_cpu_idle_states[] = {
>  static struct cpu_idle_states power9_dd1_cpu_idle_states[] = {
>  	{
>  		.name = "stop0_lite",
> -		.latency_ns = 200,
> -		.residency_ns = 2000,
> +		.latency_ns = 1000,
> +		.residency_ns = 10000,
>  		.flags = 0*OPAL_PM_DEC_STOP \
>  		       | 0*OPAL_PM_TIMEBASE_STOP  \
>  		       | 0*OPAL_PM_LOSE_USER_CONTEXT \

