[Skiboot] [PATCH] occ: Filter out entries from Pmin to Pmax in pstate table
svaidy at linux.vnet.ibm.com
Mon Apr 18 16:48:06 AEST 2016
* Balbir Singh <bsingharora at gmail.com> [2016-04-18 16:03:26]:
> On 18/04/16 15:32, Shilpasri G Bhat wrote:
> > Hi Balbir,
> > On 04/18/2016 10:39 AM, Balbir Singh wrote:
> >> On 16/04/16 03:12, Shilpasri G Bhat wrote:
> >>> Parse the entire pstate table provided by OCC and filter out the
> >>> entries that are outside the Pmax and Pmin limits. This can
> >>> occur when turbo mode is disabled and OCC limits the Pmax to
> >>> nominal pstate, but includes turbo pstates in the pstate table.
> >>> We end up with wrong pstates in such cases if we do not parse
> >>> the pstate table to filter out the correct range.
> >>> Signed-off-by: Shilpasri G Bhat <shilpa.bhat at linux.vnet.ibm.com>
> >> Can turbo mode be turned on/off without rebooting? We seem to be doing
> >> this once during boot. Shouldn't the OS handle the pmax/pmin checks?
> > No. Turbo mode is turned on/off during boot. This issue cannot be handled
> > solely in the OS.
> > The issue here is that skiboot parses the pstate table up to nr_pstates, where
> > nr_pstates = pmax - pmin + 1
> > Now if turbo is disabled, we have pmax = nominal_pstate, but the pstate table
> > still contains entries from turbo down to pmin. Skiboot parses the first
> > nr_pstates entries, which start from turbo and not from the actual pmax (the
> > nominal pstate). This results in a smaller set of valid pstates being exported
> > to the host. So we need skiboot to filter out the correct pstate range.
> Fair point!
> I was wondering: if we just expose all the pstates, along with p_min, p_max and
> nr_pstates, the OS might be better placed to filter states and implement the
> right thing in the future without requiring a reboot.
> We might be able to show turbo p-states and show them as disabled in the OS.
There are two scenarios of turbo mode control:
(a) OS driven policy:
This is the case you are talking about where we would like Linux OS
cpufreq driver to control the limits of cpu frequency scaling in order
to drive responsiveness and efficiency for workloads.
We can do this today by setting the max scaling frequency limit in the
ondemand and performance governors, and we can have tuned profiles to
make this simple for end users.
The full cpu frequency range, including turbo, is exposed to the cpu-freq
driver, and the min/nominal/max frequencies are available in sysfs to
set the required policy.
We can disable turbo mode today by setting max scaling frequency to
the current nominal frequency.
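As a rough illustration of (a), here is a minimal sketch of capping the max
scaling frequency at nominal from userspace. The helper name
cap_max_to_nominal() is hypothetical, and the sysfs file names in the usage
note (cpuinfo_nominal_freq, scaling_max_freq) are assumptions about what the
cpufreq driver exposes on a given system, so check them before relying on this.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a frequency in kHz from nominal_path and
 * write it to max_path. Returns 0 on success, -1 on any I/O or parse
 * error.
 */
static int cap_max_to_nominal(const char *nominal_path, const char *max_path)
{
	unsigned long khz;
	int ok;
	FILE *f = fopen(nominal_path, "r");

	if (!f)
		return -1;
	ok = (fscanf(f, "%lu", &khz) == 1);
	fclose(f);
	if (!ok)
		return -1;

	f = fopen(max_path, "w");
	if (!f)
		return -1;
	ok = (fprintf(f, "%lu\n", khz) > 0);
	/* The write is buffered, so a rejected value shows up at fclose(). */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

For cpu0 this would be called (as root) with something like
"/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_nominal_freq" and
"/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq", repeated for each
cpufreq policy.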
(b) Firmware driven policy:
This is the case Shilpa's fix addresses. The goal here is to have an
equivalent of BIOS based settings that is consistent across different
OS instances and limits the policies that OS can use.
Turbo mode disable in scenario (a) can be done by setting cpu-freq
driver limits. However, for case (b) we need to filter out the allowed
range and tell Linux only the PStates that Linux will be allowed to use.
OCC will limit the range, and today if the PState requested by Linux is
not granted by OCC then we assume a power cap or throttle situation
and do not expect to have a cpu freq policy conflict between OCC and Linux.
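
To make the filtering in (b) concrete, here is a minimal sketch and not the
code from the patch. struct pstate_entry and filter_pstates() are
hypothetical names, and the entry layout is an assumption; the real OCC table
format differs. It only shows the idea of walking the whole table and keeping
entries inside [pmin, pmax], instead of taking the first nr_pstates entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout, for illustration only. */
struct pstate_entry {
	int8_t id;		/* pstate id; pmin <= pmax numerically */
	uint32_t freq_khz;	/* frequency for this pstate */
};

/*
 * Walk all n entries of table[] and copy those with
 * pmin <= id <= pmax into out[], preserving their order.
 * Returns the number of entries copied (at most out_cap).
 */
static size_t filter_pstates(const struct pstate_entry *table, size_t n,
			     int pmin, int pmax,
			     struct pstate_entry *out, size_t out_cap)
{
	size_t kept = 0;
	size_t i;

	assert(pmin <= pmax);

	for (i = 0; i < n && kept < out_cap; i++) {
		/* Skip entries outside the limits, e.g. turbo pstates
		 * when Pmax has been lowered to nominal. */
		if (table[i].id < pmin || table[i].id > pmax)
			continue;
		out[kept++] = table[i];
	}
	return kept;
}
```

With a table running from turbo down to pmin and pmax set to nominal, the
turbo entries at the top are skipped and the count returned equals
pmax - pmin + 1, provided the table has an entry for every pstate in that range.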