[RFC] powerpc: add support for new hcall H_BEST_ENERGY
Vaidyanathan Srinivasan
svaidy at linux.vnet.ibm.com
Wed Apr 7 14:57:18 EST 2010
* Benjamin Herrenschmidt <benh at kernel.crashing.org> [2010-04-07 12:04:49]:
> On Wed, 2010-03-03 at 23:48 +0530, Vaidyanathan Srinivasan wrote:
> > Hi,
>
> > diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
> > index 03dd6a2..fbd93e3 100644
> > --- a/arch/powerpc/kernel/setup-common.c
> > +++ b/arch/powerpc/kernel/setup-common.c
> > @@ -359,6 +359,8 @@ void __init check_for_initrd(void)
> > #ifdef CONFIG_SMP
> >
> > int threads_per_core, threads_shift;
> > +EXPORT_SYMBOL_GPL(threads_per_core);
>
> While I agree it should be exported for the APIs in cputhread.h to be
> usable from a module, this variable shouldn't be -used- directly, but
> only via the API functions in there.
Ok agreed. This will be the first module to use this and hence will
have to implement the API and export the function.
> .../...
>
> > +
> > +#define MODULE_VERS "1.0"
> > +#define MODULE_NAME "pseries_energy"
> > +
> > +/* Helper Routines to convert between drc_index to cpu numbers */
> > +
> > +static u32 cpu_to_drc_index(int cpu)
> > +{
> > + struct device_node *dn = NULL;
> > + const int *indexes;
> > + int i;
> > + dn = of_find_node_by_name(dn, "cpus");
>
> Check the result. Also that's not a nice way to do that, you should look
> for /cpus by path I reckon.
I will check the return code, but why do you feel getting the node by
path is better? Is there any race, or we may have duplicate "cpus"
node?
> > + indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
>
> Check the result here too.
Yes, will do.
> > + /* Convert logical cpu number to core number */
> > + i = cpu / threads_per_core;
>
> Don't use that variable as I said earlier. Use cpu_thread_to_core()
Ack
> > + /*
> > + * The first element indexes[0] is the number of drc_indexes
> > + * returned in the list. Hence i+1 will get the drc_index
> > + * corresponding to core number i.
> > + */
> > + WARN_ON(i > indexes[0]);
> > + return indexes[i + 1];
> > +}
> > +
> > +static int drc_index_to_cpu(u32 drc_index)
> > +{
> > + struct device_node *dn = NULL;
> > + const int *indexes;
> > + int i, cpu;
> > + dn = of_find_node_by_name(dn, "cpus");
> > + indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
>
> Same comments, check results and use /cpus path
Sure. Will do.
> > + /*
> > + * First element in the array is the number of drc_indexes
> > + * returned. Search through the list to find the matching
> > + * drc_index and get the core number
> > + */
> > + for (i = 0; i < indexes[0]; i++) {
> > + if (indexes[i + 1] == drc_index)
> > + break;
> > + }
> > + /* Convert core number to logical cpu number */
> > + cpu = i * threads_per_core;
>
> Here's more annoying as we don't have an API in cputhread.h for that.
>
> In fact, we have a confusion in there since cpu_first_thread_in_core()
> doesn't actually take a core number ...
>
> I'm going to recommend a complicated approach but that's the best in the
> long run:
>
> - First do a patch that renames cpu_first_thread_in_core() to something
> clearer like cpu_first_thread_in_same_core() or cpu_leftmost_sibling()
> (I think I prefer the later). Rename the few users in
> arch/powerpc/mm/mmu_context_nohash.c and arch/powerpc/kernel/smp.c
>
> - Then do a patch that adds a cpu_first_thread_of_core() that takes a
> core number, basically does:
>
> static inline int cpu_first_thread_of_core(int core)
> {
> return core << threads_shift;
> }
>
> - Then add your modified H_BEST_ENERGY patch on top of it using the
> above two as pre-reqs :-)
Ok, I can do that change and then base this patch on the new API.
> > + return cpu;
> > +}
> > +
> > +/*
> > + * pseries hypervisor call H_BEST_ENERGY provides hints to OS on
> > + * preferred logical cpus to activate or deactivate for optimized
> > + * energy consumption.
> > + */
> > +
> > +#define FLAGS_MODE1 0x004E200000080E01
> > +#define FLAGS_MODE2 0x004E200000080401
> > +#define FLAGS_ACTIVATE 0x100
> > +
> > +static ssize_t get_best_energy_list(char *page, int activate)
> > +{
> > + int rc, cnt, i, cpu;
> > + unsigned long retbuf[PLPAR_HCALL9_BUFSIZE];
> > + unsigned long flags = 0;
> > + u32 *buf_page;
> > + char *s = page;
> > +
> > + buf_page = (u32 *) get_zeroed_page(GFP_KERNEL);
> > +
> Why that blank line ?
I will remove them
> > + if (!buf_page)
> > + return 0;
>
> So here you return 0 instead of -ENOMEM
Will fix. -ENOMEM is correct.
> > + flags = FLAGS_MODE1;
> > + if (activate)
> > + flags |= FLAGS_ACTIVATE;
> > +
> > + rc = plpar_hcall9(H_BEST_ENERGY, retbuf, flags, 0, __pa(buf_page),
> > + 0, 0, 0, 0, 0, 0);
> > +
> > +
>
> Again, no need for a blank line before xx = foo() and if (xx)
Ack.
> > if (rc != H_SUCCESS) {
> > + free_page((unsigned long) buf_page);
> > + return -EINVAL;
> > + }
>
> And here you return an error code. Which one is right ?
Returning an error code is correct to user space. The call will check
for return value on read while getting the list.
Thanks for the detailed review. I will fix the issues that you
pointed out and post the next version.
--Vaidy
More information about the Linuxppc-dev
mailing list