[Skiboot] [PATCH] SLW: Add idle state stop5 for DD2.0 and above

Wed Feb 7 16:55:40 AEDT 2018

Nicholas Piggin <npiggin at gmail.com> writes:
> On Tue, 14 Nov 2017 18:52:09 +1100
> Stewart Smith <stewart at linux.vnet.ibm.com> wrote:
>
>> Akshay Adiga <akshay.adiga at linux.vnet.ibm.com> writes:
>> > Adding stop5 idle state with rough residency and latency numbers.  
>> 
>> How stable has stop5 proved? I gather this patch is an indication that
>> we're fairly confident we have stop5 working well enough for an OS to
>> use?
>> 
>> Is there some way we could write a test from userspace to force a core
>> into stop5 and out again a bunch of times? Maybe disable all other stop
>> states, do nothing on it and then check?
>> 
>
> Yes, disable all other stop states. You can do it at runtime with
> /sys/devices/system/cpu/cpu*/cpuidle/state*/disable
>
> You can then use context switch benchmark or some IO traffic etc
> to really hammer it.

So, I've gotten a skeleton going for an op-test test that does this.

It also turns out you need your kernel bugfix
3ed09c94580de9d5b18cc35d1f97e9f24cd9233b "cpuidle: menu: allow state 0 to be disabled"
or else it's not a very useful test, as you end up just testing snooze
rather than any stop state.
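
For illustration, here is a minimal sketch of the "disable everything but
one state" step, assuming the usual cpuidle sysfs layout (a 'name' and a
'disable' file under each stateN directory). The function name
isolate_idle_state and the sysfs_root parameter are made up for this
example and are not part of op-test. On a kernel without the fix above,
the write to state 0 may not take effect.

```python
import glob
import os


def isolate_idle_state(cpu, keep, sysfs_root="/sys/devices/system/cpu"):
    """Disable every cpuidle state on `cpu` except the one named `keep`.

    Hypothetical helper: returns the names of the states it disabled.
    `sysfs_root` is a parameter only so the function can be pointed at
    a fake directory tree for testing.
    """
    disabled = []
    pattern = os.path.join(sysfs_root, "cpu%d" % cpu, "cpuidle", "state*")
    for state_dir in sorted(glob.glob(pattern)):
        with open(os.path.join(state_dir, "name")) as f:
            name = f.read().strip()
        # Writing 1 to 'disable' turns the state off, 0 turns it back on.
        flag = "0" if name == keep else "1"
        with open(os.path.join(state_dir, "disable"), "w") as f:
            f.write(flag)
        if flag == "1":
            disabled.append(name)
    return disabled
```

Something like isolate_idle_state(6, "stop5") would then leave only stop5
enabled on CPU 6 (it needs root to write to the real sysfs files).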

I'm now trying to come up with the world's worst context switch/IO
benchmark that can be run in a busybox shell with whatever we build into
petitboot.

Can anyone think of something better than
'taskset -c 6 find / |head -n 200000 > /dev/null'
?
(6 = cpu nr being tested)

Weirdly though, CPU0 seems to get a bunch of accounting done for it over
others, with it 'entering' (according to /sys/..../stateN/usage) the
stop state a lot more than other threads or even other cores....

# CPU 0 entered idle state ['stop4'] 236 times
# CPU 1 entered idle state ['stop4'] 1 times
# CPU 2 entered idle state ['stop4'] 1 times
# CPU 9 entered idle state ['stop4'] 1 times
# CPU 0 entered idle state ['stop5'] 221 times
# CPU 1 entered idle state ['stop5'] 1 times
# CPU 2 entered idle state ['stop5'] 1 times
# CPU 9 entered idle state ['stop5'] 1 times

(This test disables all but the stop state being tested, runs the task
above on the CPU under test, then measures the difference in the 'usage'
number for the CPU we're interested in.)
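
The measurement described above could be sketched roughly as follows. This
is not the actual op-test code: the helper names read_usage and
usage_delta and the sysfs_root parameter are invented for the example, and
it assumes each stateN directory exposes 'name' and 'usage' files.

```python
import glob
import os


def read_usage(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return {state name: usage count} for one CPU (hypothetical helper)."""
    usage = {}
    pattern = os.path.join(sysfs_root, "cpu%d" % cpu, "cpuidle", "state*")
    for state_dir in glob.glob(pattern):
        with open(os.path.join(state_dir, "name")) as f:
            name = f.read().strip()
        with open(os.path.join(state_dir, "usage")) as f:
            usage[name] = int(f.read().strip())
    return usage


def usage_delta(before, after):
    """Number of times each state was entered between two snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}
```

Usage would be: take a read_usage() snapshot for the CPU, run the
workload pinned to it, take a second snapshot, and report
usage_delta(before, after) for the state under test.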

Is this behaviour expected?

-- 
Stewart Smith
OPAL Architect, IBM.