Patch: cpu utilization monitor.

ahuja at austin.ibm.com
Thu Mar 18 08:57:42 EST 2004


Shailabh,

This patch only collects the data. It does no reporting at all.

We can work things out offline so that we don't duplicate any effort
on reporting the data.

Cheers,
Manish


> Manish/Linas, if you're writing the entity to determine the real
> fraction, there's no duplication of effort. If you're getting into
> reporting it to higher level users (human or software), you might be -
> we currently have two kernel-user paths for sending such
> info up to the user (one for manual users of CKRM, one for middleware).
> We'll be doing a code drop on lkml in the next day or
> two so you'll be able to determine for yourself.
>
> Up in user space, CKRM's tooling is rudimentary. With the new filesystem
> API that we're using, it's even more likely we'll be leaning towards
> scripts initially.
>
> Naturally, we'd be happy to discuss all this further. The CKRM project
> has so many high-priority items on its plate that integration with
> other projects (such as cpumemsets for NUMA, or yours for LPAR) isn't
> important yet, but if we keep in sync at a high level, it may be
> possible to avoid duplicated effort and incompatible design choices.
>
> Hope this helps,
> Shailabh
>
> ahuja at austin.ibm.com wrote:
>
> >Thanks for the comments everyone.
> >
> >As Linas said earlier, the value reported by the OS, whether the cpu
> >is 100% busy or 50% busy, no longer bears any relation to the actual
> >physical CPU allocated to it.
> >
> >I am attempting to normalize the value that the OS reports to the actual
> >cpu use and give a more accurate picture to other tools/user space. Now
> >there are a couple of different requirements and I hope to get to all of
> >them as this progresses.
> >
> >I will try to fix the code based on the comments I have received so far.
> >I did give CKRM a cursory glance, and I am not sure that I am duplicating
> >effort here. But let me look further into that.
> >
> >Thanks,
> >Manish
> >
> >
> >On Wed, 17 Mar 2004, Mike Kravetz wrote:
> >
> >
> >
> >>On Wed, Mar 17, 2004 at 11:13:59AM -0800, Dave Hansen wrote:
> >>
> >>
> >>>On Wed, 2004-03-17 at 10:56, linas at austin.ibm.com wrote:
> >>>
> >>>
> >>>>This patch differs from other efforts in that it gets data directly from
> >>>>the hypervisor.  Think multiple virtual cpus running on one physical cpu.
> >>>>The traditional tools, whether CKRM or top or vmstat, are blind to the
> >>>>fact that any given 'virtual cpu' might be getting only 10% of the physical
> >>>>cycles in one hypervisor time-slice, and 90% in another.
> >>>>
> >>>>Very crudely, it's sort of like VM on the 390/zSeries.  Your kernel may
> >>>>think it's 100% busy, but in fact it might be getting only 1% of the actual
> >>>>physical hardware cycles.  The goal here is to be able to report the
> >>>>fraction of the total physical cycles, and do so on a HZ or even sub-HZ
> >>>>level of granularity.
> >>>>
> >>>>
> >>>But, the number is still just another performance counter, right?  Is
> >>>the interface to fetch it the same as the other CPU performance
> >>>counters?
> >>>
> >>>I think what Greg was getting at is that CKRM aims to be able to make
> >>>resource decisions based on data it gets from all kinds of sources,
> >>>including performance counters.  If you export this 'virtual cpu' slice
> >>>in the same way that other CKRM-handled data are, then you can probably
> >>>access it in whatever way you wanted, and you get the code reuse benefit
> >>>of using the rest of the CKRM work.  Shailabh, am I on the right track
> >>>here?  I'm kinda guessing at what the CKRM goals are here.
> >>>
> >>>What is the planned use of this counter?  Will it simply be exported to
> >>>userspace, or will the kernel need it internally for something?
> >>>
> >>>
> >>>
> >>Actually, this type of data sounds like something that (forgive me
> >>for mentioning this!!!) the IBM eWLM product would want to know.
> >>I don't think CKRM, or the OS can do much with this type of data
> >>except report it for further analysis.  More interesting is what
> >>something that, let's say, 'controls the entire machine' can do with
> >>this data.  For example, one OS isn't getting enough CPU cycles
> >>and another OS has excess cycles.  Let's turn the knobs to balance
> >>things out at the machine/hypervisor level.
> >>
> >>Perhaps this is what was meant by Linas's original reference to
> >>'on demand'?
> >>
> >>--
> >>Mike
> >>
> >>
> >>
> >>
> >>
> >
> >
> >
> >
>
>
>
>
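
The normalization discussed in the thread can be illustrated with a small
sketch. This is not code from the patch. It assumes the simplest possible
model: the busy fraction the OS reports is scaled by the share of physical
cycles the hypervisor granted to the virtual cpu in the same interval. The
function names (physical_utilization, average_utilization) and the
equal-length sampling intervals are hypothetical choices made only for
illustration.

```python
def physical_utilization(os_busy, physical_share):
    """Scale the OS-reported busy fraction by the physical share.

    os_busy        -- busy fraction as seen by the OS (0.0 to 1.0)
    physical_share -- fraction of physical cycles granted by the
                      hypervisor in the same interval (0.0 to 1.0)
    Both inputs are assumed to cover the same time interval.
    """
    for name, value in (("os_busy", os_busy),
                        ("physical_share", physical_share)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return os_busy * physical_share


def average_utilization(samples):
    """Average the normalized value over equal-length intervals.

    samples -- iterable of (os_busy, physical_share) pairs
    """
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one sample")
    total = sum(physical_utilization(busy, share) for busy, share in samples)
    return total / len(samples)


if __name__ == "__main__":
    # A kernel that believes it is fully busy while receiving 1% of the
    # physical cycles, as in the 390/zSeries comparison in the thread.
    print(physical_utilization(1.0, 0.01))
    # The same virtual cpu getting 10% in one time-slice and 90% in another.
    print(average_utilization([(1.0, 0.10), (1.0, 0.90)]))
```

How the physical share is actually obtained from the hypervisor, and at what
granularity, is left open here; the thread only states that the data comes
directly from the hypervisor.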


** Sent via the linuxppc64-dev mail list. See http://lists.linuxppc.org/




