[PATCH] ppc64 cpu hotplug

Wed Mar 24 19:15:40 EST 2004

Hi-

> @@ -371,6 +371,7 @@
>  			wait_time = rtas_extended_busy_delay_time(status);
>  			set_current_state(TASK_UNINTERRUPTIBLE);
>  			schedule_timeout(wait_time);
> +			break;
>  		    }
>  		    default:	/* shouldn't happen */
>  			return -ENOMSG;

While I think this is technically correct, I don't understand why
check_dr_state() is there at all.  There is a similar rtas_get_sensor()
function which basically does the same thing, except that it uses udelay
instead of schedule_timeout.  Why don't we use that?

> @@ -382,13 +383,13 @@
>
>  /* Search all cpu DR entities, looking for one which is present.  If
>   * the same hw index as before is available, grab that in preference.a
> - * Match the dr-index to a cpu node in the device tree.  Use the reg
> - * (hw index) from the node to query rtas if the cpu is in a stopped
> - * state.
> + * Match the dr-index to a cpu node in the device tree.  Use the
> + * ibm,ppc-interrupt-server#s (hw index) from the node to query rtas
> + * if the cpu is in a stopped state.
>   */
>  static unsigned int find_physical_cpu_to_start(unsigned int old_hwindex)
>  {
> -	int i, idx;
> +	int i, j, idx;
>  	int count = 0;
>  	int num_addr_cell, num_size_cell, len;
>  	struct device_node *np;
> @@ -426,18 +427,23 @@
>
>  			if (!ireg || ireg[0] != idx)
>  				continue;
> -			ireg = (unsigned int *)get_property(np, "reg", &len);
>
> -			if (!ireg)
> -				continue;
> +			ireg = (unsigned int*) get_property(np, "ibm,ppc-interrupt-server#s", &len);
>
> -			status = query_cpu_stopped(*ireg);
> -			if (status == 0) {
> -				best = *ireg;
> +			if (unlikely(!ireg)){
> +			    ireg = (unsigned int*) get_property(np, "reg", &len);/* fake it with phys id */
> +			    if(!ireg)
> +				continue;
> +			}
> +			for(j = 0; j < (len / sizeof(u32)); j++){
> +			    status = query_cpu_stopped(ireg[j]);
> +			    if (status == 0) {
> +				best = ireg[j];
>  				if (best == old_hwindex) {
> -					of_node_put(np);
> -					goto out;
> +				    of_node_put(np);
> +				    goto out;
>  				}
> +			    }
>  			}
>  		}
>  	}

I do not have any problem (save coding style) with this part of the
patch; however, I do not understand why find_physical_cpu_to_start is
potentially polling every possible drc-index.  If a cpu is present in
the device tree, it belongs to the partition and it is configured
(except when we've *just* released a cpu to the hypervisor - we
currently rely on userspace to remove the node from the device tree -
yuck).  Seems to me that we should simply query each logical cpu for
each cpu device node.  To be safe, we could check the dr state of each
cpu node using its ibm,my-drc-index property.

I am including a patch which implements my suggestions and adapts the
second part of Joel's patch to it:

o  use rtas_get_sensor, ditch check_dr_state
o  poll a drc-index only when it is attached to a known cpu
o  handle SMT cpus properly by using the "ibm,ppc-interrupt-server#s"
property when available instead of the "reg" property

I have tested this on a 2-way Power4 lpar, but I am not able to test the
SMT case at this time.  Patch is against a current-ish Ameslab tree
(2.6.5-rc2).

Nathan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: find_phys_cpu_rework.patch
Type: text/x-patch
Size: 4303 bytes
Desc: not available
Url : http://ozlabs.org/pipermail/linuxppc64-dev/attachments/20040324/bed168cd/attachment.bin