[Skiboot] [PATCH] SLW: Increase stop4-5 residency by 10x
Vaidyanathan Srinivasan
svaidy at linux.vnet.ibm.com
Wed Mar 21 18:29:16 AEDT 2018
* Akshay Adiga <akshay.adiga at linux.vnet.ibm.com> [2018-03-21 08:57:36]:
> Using the DGEMM benchmark, we observed a 5-9% drop in throughput with
> stop4/5 enabled compared to a run without them. In this benchmark the GPU
> waits on the CPU to wake up and provide the subsequent data block to
> compute. The wakeup latency accumulates over the run and shows up as a
> performance drop.
>
> Linux enters stop4/5 more aggressively than their wakeup latency warrants.
> Increasing the stop4 residency from 1ms to 10ms (and stop5 from 2ms to
> 20ms) reduces the performance drop to <1%.
>
> Signed-off-by: Akshay Adiga <akshay.adiga at linux.vnet.ibm.com>
Acked-by: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
Tested-by: Vaidyanathan Srinivasan <svaidy at linux.vnet.ibm.com>
> ---
> hw/slw.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/slw.c b/hw/slw.c
> index db238ec..515582b 100644
> --- a/hw/slw.c
> +++ b/hw/slw.c
> @@ -598,7 +598,7 @@ static struct cpu_idle_states power9_cpu_idle_states[] = {
> {
> .name = "stop4",
> .latency_ns = 100000,
> - .residency_ns = 1000000,
> + .residency_ns = 10000000,
> .flags = 0*OPAL_PM_DEC_STOP \
> | 0*OPAL_PM_TIMEBASE_STOP \
> | 1*OPAL_PM_LOSE_USER_CONTEXT \
> @@ -614,7 +614,7 @@ static struct cpu_idle_states power9_cpu_idle_states[] = {
> {
> .name = "stop5",
> .latency_ns = 200000,
> - .residency_ns = 2000000,
> + .residency_ns = 20000000,
> .flags = 0*OPAL_PM_DEC_STOP \
> | 0*OPAL_PM_TIMEBASE_STOP \
> | 1*OPAL_PM_LOSE_USER_CONTEXT \
Tuning the thresholds reduces stop4/5 entry throughout the runtime
of the GPU benchmark/workload and recovers performance. The lost power
savings have very little impact in the GPU workload scenario, since GPU
power and utilization dominate the overall runtime efficiency.
This threshold setting is a performance trade-off driven by the wakeup
latency of these deep idle states.
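
To illustrate why a larger residency_ns reduces stop4/5 entry, here is a
minimal sketch. It assumes a simplified selection rule (pick the deepest
state whose target residency fits within the predicted idle time); this is
not the actual Linux cpuidle governor, and struct idle_state and
pick_state() are hypothetical names used only for illustration. The
latency_ns and residency_ns values are the ones from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for skiboot's struct cpu_idle_states. */
struct idle_state {
	const char *name;
	uint64_t latency_ns;   /* wakeup latency */
	uint64_t residency_ns; /* minimum idle time for entry to pay off */
};

/* stop4/stop5 values before the patch. */
static const struct idle_state before_patch[] = {
	{ "stop4", 100000, 1000000 },
	{ "stop5", 200000, 2000000 },
};

/* stop4/stop5 values after the patch (residency increased 10x). */
static const struct idle_state after_patch[] = {
	{ "stop4", 100000, 10000000 },
	{ "stop5", 200000, 20000000 },
};

/*
 * Assumed, simplified rule: states are ordered shallow to deep, and the
 * deepest state whose residency fits in the predicted idle time wins.
 * Returns the index of the chosen state, or -1 if none qualifies.
 */
static int pick_state(const struct idle_state *states, size_t n,
		      uint64_t predicted_idle_ns)
{
	int best = -1;

	assert(states != NULL);
	for (size_t i = 0; i < n; i++) {
		if (states[i].residency_ns <= predicted_idle_ns)
			best = (int)i;
	}
	return best;
}
```

Under this rule, a predicted idle period of 5ms selects stop5 with the old
table but neither state with the new one, so short idle gaps between GPU
data blocks no longer pay the stop4/5 wakeup latency.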
--Vaidy