[PATCH V2 2/2] mm: add FAULT_AROUND_ORDER Kconfig parameter for powerpc

Fri Apr 4 18:10:59 EST 2014

* Ingo Molnar <mingo at kernel.org> wrote:

> * Madhavan Srinivasan <maddy at linux.vnet.ibm.com> wrote:
> 
> > Performance data for different FAULT_AROUND_ORDER values from 4 
> > socket Power7 system (128 Threads and 128GB memory) is below. perf 
> > stat with repeat of 5 is used to get the stddev values. This patch 
> > creates FAULT_AROUND_ORDER Kconfig parameter and defaults it to 3 
> > based on the performance data.
> > 
> > FAULT_AROUND_ORDER      Baseline        1               3               4               5               7
> > 
> > Linux build (make -j64)
> > minor-faults            7184385         5874015         4567289         4318518         4193815         4159193
> > times in seconds        61.433776136    60.865935292    59.245368038    60.630675011    60.56587624     59.828271924
> >  stddev for time	( +-  1.18% )	( +-  1.78% )	( +-  0.44% )	( +-  2.03% )	( +-  1.66% )	( +-  1.45% )
> 
> Ok, this is better, but it is still rather incomplete statistically, 
> please also calculate the percentage difference to baseline, so that 
> the stddev becomes meaningful and can be compared to something!
> 
> As an example I did this for the first line of measurements (all 
> errors in the numbers are mine, this was done manually), and it 
> gives:
> 
> >  stddev for time   ( +-  1.18% ) ( +-  1.78% ) ( +-  0.44% ) ( +-  2.03% ) ( +-  1.66% ) ( +-  1.45% )
>                                         +0.9%         +3.5%         +1.3%         +1.4%         +2.6%
> 
> This shows that there is probably a statistically significant 
> (positive) effect from the change, but from these numbers alone I 
> would not draw any quantitative (sizing, tuning) conclusions, 
> because in 3 out of 5 cases the stddev was larger than the effect, 
> so the resulting percentages are not comparable.

Also note that because we calculate the percentage by dividing the 
result by the baseline, the stddevs of the two values roughly add up. 
So for example in the second column (order 3) the true noise is 
around 1.5%, not 0.4%.
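
As a rough illustration, the percentage difference to baseline and 
the combined noise can be computed from the quoted 'Linux build' 
numbers as sketched below. The function names are made up for this 
example, and the noise estimate uses the simple 'stddevs add up' rule 
of thumb from above (a quadrature sum would give a somewhat smaller 
figure if the errors are independent).

```python
# Wall-clock times (seconds) and perf-stat stddev (%) from the quoted table.
BASELINE_TIME = 61.433776136
BASELINE_STDDEV = 1.18

# FAULT_AROUND_ORDER -> (time in seconds, stddev in %)
RUNS = {
    1: (60.865935292, 1.78),
    3: (59.245368038, 0.44),
    4: (60.630675011, 2.03),
    5: (60.56587624, 1.66),
    7: (59.828271924, 1.45),
}


def improvement_pct(baseline, result):
    """Percentage by which 'result' is faster than 'baseline'."""
    return (baseline - result) / baseline * 100.0


def combined_noise_pct(baseline_stddev, result_stddev):
    """Rule of thumb: relative stddevs of a ratio roughly add up."""
    return baseline_stddev + result_stddev


def summarize():
    """Return (order, effect %, noise %, effect above noise?) per column."""
    rows = []
    for order, (seconds, stddev) in sorted(RUNS.items()):
        effect = improvement_pct(BASELINE_TIME, seconds)
        noise = combined_noise_pct(BASELINE_STDDEV, stddev)
        rows.append((order, effect, noise, effect > noise))
    return rows


if __name__ == "__main__":
    for order, effect, noise, above in summarize():
        print(f"order {order}: {effect:+.1f}% effect, "
              f"~{noise:.1f}% noise, above noise: {above}")
```

With this stricter estimate only the order 3 column has an effect 
that is larger than its combined noise.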

So for good sizing decisions the stddev must be 'comfortably' below 
the effect. (Or sizing should be done based on the other workloads 
you tested; I have not checked them.)

It also makes sense to run more measurements to reduce the stddev of 
the baseline. So if each measurement is run 3 times then it makes 
sense to run the baseline 6 times: this cuts the baseline's noise by 
~30% (a factor of 1/sqrt(2)) and so improves the confidence of our 
result, at just a small increase in test time.
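
The ~30% figure can be checked with a few lines, assuming the runs 
are independent and the noise of the averaged result scales as 
1/sqrt(number of runs). The helper names below are illustrative only.

```python
import math


def stderr_of_mean(stddev, runs):
    """Standard error of the mean of 'runs' independent measurements."""
    return stddev / math.sqrt(runs)


def noise_reduction_pct(runs_before, runs_after):
    """Percentage drop in the standard error when going from
    'runs_before' to 'runs_after' measurements."""
    return (1.0 - math.sqrt(runs_before / runs_after)) * 100.0


if __name__ == "__main__":
    # Doubling the baseline runs from 3 to 6.
    print(f"{noise_reduction_pct(3, 6):.1f}% less noise on the baseline")
```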

[ For such cases it might also make sense to script all of that, 
  combined with a debug patch that puts the tuned fault-around value 
  into a dynamic knob in /proc/sys/, so that you can run the full 
  measurement in a single pass, with no reboot and with no human 
  intervention. ]
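
A minimal sketch of such a script follows. The knob path is purely 
hypothetical (no such file exists without a debug patch that adds 
it), and the workload is a placeholder: a real kernel-build test 
would also have to reset the build tree between repetitions and 
handle the baseline case, both of which are left out here.

```python
import subprocess

# Hypothetical knob: the text above only says "a dynamic knob in
# /proc/sys/"; this exact path is an assumption for illustration.
KNOB_PATH = "/proc/sys/vm/fault_around_order"


def set_knob(value, path=KNOB_PATH):
    """Write the fault-around order into the (hypothetical) knob."""
    with open(path, "w") as knob:
        knob.write(f"{value}\n")


def perf_command(workload, repeat):
    """Build a 'perf stat' command line that repeats the workload and
    counts minor faults."""
    return ["perf", "stat", "-r", str(repeat),
            "-e", "minor-faults", "--"] + list(workload)


def run_sweep(orders, workload, repeat=5, runner=subprocess.run,
              path=KNOB_PATH):
    """Measure the workload once per fault-around order, no reboot."""
    for order in orders:
        set_knob(order, path)
        runner(perf_command(workload, repeat), check=True)


if __name__ == "__main__":
    # Needs root and the debug patch; orders taken from the table above.
    run_sweep([1, 3, 4, 5, 7], ["make", "-j64"])
```

The 'runner' and 'path' parameters exist only so the loop can be 
dry-run against a scratch file without perf or the debug patch.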

Thanks,

	Ingo