[RFC] powerpc: add support for new hcall H_BEST_ENERGY

Vaidyanathan Srinivasan svaidy at linux.vnet.ibm.com
Wed Apr 7 14:57:18 EST 2010


* Benjamin Herrenschmidt <benh at kernel.crashing.org> [2010-04-07 12:04:49]:

> On Wed, 2010-03-03 at 23:48 +0530, Vaidyanathan Srinivasan wrote:
> > Hi,
> 
> > diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
> > index 03dd6a2..fbd93e3 100644
> > --- a/arch/powerpc/kernel/setup-common.c
> > +++ b/arch/powerpc/kernel/setup-common.c
> > @@ -359,6 +359,8 @@ void __init check_for_initrd(void)
> >  #ifdef CONFIG_SMP
> >  
> >  int threads_per_core, threads_shift;
> > +EXPORT_SYMBOL_GPL(threads_per_core);
> 
> While I agree it should be exported for the APIs in cputhread.h to be
> usable from a module, this variable shouldn't be -used- directly, but
> only via the API functions in there.

Ok agreed.  This will be the first module to use this and hence will
have to implement the API and export the function.

>  .../...
> 
> > +
> > +#define MODULE_VERS "1.0"
> > +#define MODULE_NAME "pseries_energy"
> > +
> > +/* Helper Routines to convert between drc_index to cpu numbers */
> > +
> > +static u32 cpu_to_drc_index(int cpu)
> > +{
> > +	struct device_node *dn = NULL;
> > +	const int *indexes;
> > +	int i;
> > +	dn = of_find_node_by_name(dn, "cpus");
> 
> Check the result. Also that's not a nice way to do that, you should look
> for /cpus by path I reckon.

I will check the return code, but why do you feel getting the node by
path is better?  Is there any race, or we may have duplicate "cpus"
node?

> > +	indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
> 
> Check the result here too.

Yes, will do.

> > +	/* Convert logical cpu number to core number */
> > +	i = cpu / threads_per_core;
> 
> Don't use that variable as I said earlier. Use cpu_thread_to_core()

Ack

> > +	/*
> > +	 * The first element indexes[0] is the number of drc_indexes
> > +	 * returned in the list.  Hence i+1 will get the drc_index
> > +	 * corresponding to core number i.
> > +	 */
> > +	WARN_ON(i > indexes[0]);
> > +	return indexes[i + 1];
> > +}
> > +
> > +static int drc_index_to_cpu(u32 drc_index)
> > +{
> > +	struct device_node *dn = NULL;
> > +	const int *indexes;
> > +	int i, cpu;
> > +	dn = of_find_node_by_name(dn, "cpus");
> > +	indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
> 
> Same comments, check results and use /cpus path

Sure. Will do.

> > +	/*
> > +	 * First element in the array is the number of drc_indexes
> > +	 * returned.  Search through the list to find the matching
> > +	 * drc_index and get the core number
> > +	 */
> > +	for (i = 0; i < indexes[0]; i++) {
> > +		if (indexes[i + 1] == drc_index)
> > +			break;
> > +	}
> > +	/* Convert core number to logical cpu number */
> > +	cpu = i * threads_per_core;
> 
> Here's more annoying as we don't have an API in cputhread.h for that.
> 
> In fact, we have a confusion in there since cpu_first_thread_in_core()
> doesn't actually take a core number ...
> 
> I'm going to recommend a complicated approach but that's the best in the
> long run:
> 
>  - First do a patch that renames cpu_first_thread_in_core() to something
> clearer like cpu_first_thread_in_same_core() or cpu_leftmost_sibling()
> (I think I prefer the later). Rename the few users in
> arch/powerpc/mm/mmu_context_nohash.c and arch/powerpc/kernel/smp.c
> 
>  - Then do a patch that adds a cpu_first_thread_of_core() that takes a
> core number, basically does:
> 
> static inline int cpu_first_thread_of_core(int core)
> {
> 	return core << threads_shift;
> }
> 
>  - Then add your modified H_BEST_ENERGY patch on top of it using the
> above two as pre-reqs :-)

Ok, I can do that change and then base this patch on the new API.

> > +	return cpu;
> > +}
> > +
> > +/*
> > + * pseries hypervisor call H_BEST_ENERGY provides hints to OS on
> > + * preferred logical cpus to activate or deactivate for optimized
> > + * energy consumption.
> > + */
> > +
> > +#define FLAGS_MODE1	0x004E200000080E01
> > +#define FLAGS_MODE2	0x004E200000080401
> > +#define FLAGS_ACTIVATE  0x100
> > +
> > +static ssize_t get_best_energy_list(char *page, int activate)
> > +{
> > +	int rc, cnt, i, cpu;
> > +	unsigned long retbuf[PLPAR_HCALL9_BUFSIZE];
> > +	unsigned long flags = 0;
> > +	u32 *buf_page;
> > +	char *s = page;
> > +
> > +	buf_page = (u32 *) get_zeroed_page(GFP_KERNEL);
> > +
> Why that blank line ?

I will remove them

> > +	if (!buf_page)
> > +		return 0;
> 
> So here you return 0 instead of -ENOMEM

Will fix.  -ENOMEM is correct.

> > +	flags = FLAGS_MODE1;
> > +	if (activate)
> > +		flags |= FLAGS_ACTIVATE;
> > +
> > +	rc = plpar_hcall9(H_BEST_ENERGY, retbuf, flags, 0, __pa(buf_page),
> > +				0, 0, 0, 0, 0, 0);
> > +
> > +	
> 
> Again, no need for a blank line before xx = foo() and if (xx)

Ack.

> > if (rc != H_SUCCESS) {
> > +		free_page((unsigned long) buf_page);
> > +		return -EINVAL;
> > +	}
> 
> And here you return an error code. Which one is right ?

Returning an error code is correct to user space.  The call will check
for return value on read while getting the list.

Thanks for the detailed review.  I will fix the issues that you
pointed out and post the next version.

--Vaidy



More information about the Linuxppc-dev mailing list