[PATCH] cpuidle/powernv: Fix snooze timeout

Rafael J. Wysocki rafael at kernel.org
Thu Jun 23 08:49:06 AEST 2016


On Wed, Jun 22, 2016 at 9:36 PM, Shreyas B. Prabhu
<shreyas at linux.vnet.ibm.com> wrote:
> Snooze is a poll idle state on powernv and pseries platforms. Snooze
> has a timeout so that if a cpu stays in snooze for more than the target
> residency of the next available idle state, it exits, thereby
> giving the cpuidle governor a chance to re-evaluate and
> promote the cpu to a deeper idle state. Therefore, whenever snooze exits
> due to this timeout, its last_residency will be the target_residency of
> the next deeper state.
>
> commit e93e59ce5b85 ("cpuidle: Replace ktime_get() with local_clock()")
> changed the math around the last_residency calculation. Specifically, while
> converting the last_residency value from nanoseconds to microseconds, it
> does a right shift by 10. Due to this, in snooze timeout exit scenarios the
> calculated last_residency is roughly 2.3% less than the target_residency of
> the next available state. This pattern is picked up by
> get_typical_interval() in the menu governor, and therefore
> expected_interval in menu_select() is frequently less than the
> target_residency of any state but snooze.
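
To make the 2.3% figure concrete: a right shift by 10 divides by 1024
rather than 1000, so the result is 1000/1024, or about 97.66%, of the true
microsecond value. The following is a minimal sketch in C. The helper names
are hypothetical and are not taken from the cpuidle core; they only
illustrate the arithmetic described above.

```c
#include <stdint.h>

/*
 * Hypothetical helper: approximate ns -> us conversion using a shift.
 * Shifting right by 10 divides by 1024, not 1000, so the result is
 * about 2.3% smaller than the exact value.
 */
static inline uint64_t ns_to_us_shift(uint64_t ns)
{
	return ns >> 10;
}

/* Hypothetical helper: exact ns -> us conversion, for comparison. */
static inline uint64_t ns_to_us_exact(uint64_t ns)
{
	return ns / 1000;
}
```

For an assumed target residency of 100 us (100000 ns), ns_to_us_shift()
yields 97 while ns_to_us_exact() yields 100, so a snooze that timed out
exactly at the target appears to have fallen short of it.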
>
> Due to this we are entering snooze at a higher rate, thereby affecting
> single thread performance.
> Since the math around last_residency is not meant to be precise, fix this
> issue by setting the snooze timeout to 105% of the target_residency of the
> next available idle state.
>
> This also adds a comment explaining why the snooze timeout is necessary.

Daniel, any comments?

> Reported-by: Anton Blanchard <anton at samba.org>
> Signed-off-by: Shreyas B. Prabhu <shreyas at linux.vnet.ibm.com>
> ---
>  drivers/cpuidle/cpuidle-powernv.c | 14 ++++++++++++++
>  drivers/cpuidle/cpuidle-pseries.c | 13 +++++++++++++
>  2 files changed, 27 insertions(+)
>
> diff --git a/drivers/cpuidle/cpuidle-powernv.c b/drivers/cpuidle/cpuidle-powernv.c
> index e12dc30..5835491 100644
> --- a/drivers/cpuidle/cpuidle-powernv.c
> +++ b/drivers/cpuidle/cpuidle-powernv.c
> @@ -268,10 +268,24 @@ static int powernv_idle_probe(void)
>                 cpuidle_state_table = powernv_states;
>                 /* Device tree can indicate more idle states */
>                 max_idle_state = powernv_add_idle_states();
> +
> +               /*
> +                * Staying in snooze for a long period can degrade the
> +                * performance of the sibling cpus. Set timeout for snooze such
> +                * that if the cpu stays in snooze longer than target residency
> +                * of the next available idle state then exit from snooze. This
> +                * gives a chance to the cpuidle governor to re-evaluate and
> +                * promote it to deeper idle states.
> +                */
>                 if (max_idle_state > 1) {
>                         snooze_timeout_en = true;
>                         snooze_timeout = powernv_states[1].target_residency *
>                                          tb_ticks_per_usec;
> +                       /*
> +                        * Give a 5% margin since target residency related math
> +                        * is not precise in cpuidle core.
> +                        */
> +                       snooze_timeout += snooze_timeout / 20;
>                 }
>         } else
>                 return -ENODEV;
> diff --git a/drivers/cpuidle/cpuidle-pseries.c b/drivers/cpuidle/cpuidle-pseries.c
> index 07135e0..22de841 100644
> --- a/drivers/cpuidle/cpuidle-pseries.c
> +++ b/drivers/cpuidle/cpuidle-pseries.c
> @@ -250,10 +250,23 @@ static int pseries_idle_probe(void)
>         } else
>                 return -ENODEV;
>
> +       /*
> +        * Staying in snooze for a long period can degrade the
> +        * performance of the sibling cpus. Set timeout for snooze such
> +        * that if the cpu stays in snooze longer than target residency
> +        * of the next available idle state then exit from snooze. This
> +        * gives a chance to the cpuidle governor to re-evaluate and
> +        * promote it to deeper idle states.
> +        */
>         if (max_idle_state > 1) {
>                 snooze_timeout_en = true;
>                 snooze_timeout = cpuidle_state_table[1].target_residency *
>                                  tb_ticks_per_usec;
> +               /*
> +                * Give a 5% margin since target residency related math
> +                * is not precise in cpuidle core.
> +                */
> +               snooze_timeout += snooze_timeout / 20;
>         }
>         return 0;
>  }
> --
> 2.1.4
>

