[PATCH] powerpc: ONLINE to OFFLINE CPU state transition during removal

Vaidyanathan Srinivasan svaidy at linux.vnet.ibm.com
Thu Aug 5 23:31:09 EST 2010


* Nathan Fontenot <nfont at austin.ibm.com> [2010-07-26 14:13:35]:

> On 07/22/2010 11:13 PM, Vaidyanathan Srinivasan wrote:
> > * Robert Jennings <rcj at linux.vnet.ibm.com> [2010-07-22 21:43:44]:
> > 
> >> If a CPU remove is attempted using the 'release' interface on hardware
> >> which supports extended cede, the CPU will be put in the INACTIVE state
> >> rather than the OFFLINE state due to the default preferred_offline_state
> >> in that situation.  In the INACTIVE state it will fail to be removed.
> >>
> >> This patch changes the preferred offline state to OFFLINE when a CPU is
> >> in the ONLINE state.  After cpu_down() is called in dlpar_offline_cpu()
> >> the CPU will be OFFLINE and CPU removal can continue.
> > 
> > Hi Robert,
> > 
> > Thanks for the patch.  In a dlpar operation, we would offline the CPU
> > first using the sysfs online file and then write to the sysfs release
> > file to complete the sequence, right?  The current code in
> > dlpar_offline_cpu() would work as long as the cpu is in either the
> > inactive state or the offline state (in case of an unsupported platform).
> > 
> > Are the dlpar tools being changed to complete the operation with one
> > sysfs write to the release file?
> 
> The dlpar tools were updated so that a single write to the 'release' file
> would offline the cpu and remove it from the system.  Given this, I think
> Robert's patch should go forward to maintain compatibility.

Hi Nathan,

Thanks for clarifying.  One concern that I have is that this change
could race with a sysfs write to the cpuN/online file.

dlpar_offline_cpu() is called with cpu_hotplug_driver_lock() held, so
a sysfs operation on the online file should wait.  However, by the
time cpu_hotplug_driver_unlock() happens, the dlpar operation should
have completed, and we should not be able to online or offline the cpu
from the sysfs online file.

I wanted to confirm that a concurrent sysfs write will not cause the
dlpar operation to fail in the middle of the operation.

The previous code will work fine for platforms not supporting
cede_offline where we have only two states for the cpu.

When cede_offline is supported, we have three states, and the state
transitions are controlled by the online file and the release file in
two steps so that failures can be retried in user space.
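
As a rough illustration only, here is a small user-space model of the
three states and the two sysfs files.  It is not kernel code: every
name below (cpu_model, model_cpu_down, model_release, the with_patch
flag) is hypothetical, and the handling of the INACTIVE case is my
assumption about how the existing two-step sequence behaves.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; these names are illustrative, not the kernel's. */
enum cpu_state { CPU_ONLINE, CPU_INACTIVE, CPU_OFFLINE };

struct cpu_model {
	enum cpu_state state;
	enum cpu_state preferred_offline; /* where an offline request lands */
};

/* With cede_offline the default offline target is INACTIVE, else OFFLINE. */
static void model_init(struct cpu_model *c, bool cede_offline)
{
	c->state = CPU_ONLINE;
	c->preferred_offline = cede_offline ? CPU_INACTIVE : CPU_OFFLINE;
}

/* Models writing 0 to cpuN/online. */
static void model_cpu_down(struct cpu_model *c)
{
	if (c->state == CPU_ONLINE)
		c->state = c->preferred_offline;
}

/*
 * Models a write to the release file.  Returns 0 if the cpu ends up
 * OFFLINE (removal can continue), -1 otherwise.  with_patch selects the
 * behaviour proposed in this thread: force OFFLINE when starting ONLINE.
 */
static int model_release(struct cpu_model *c, bool with_patch)
{
	if (c->state == CPU_ONLINE) {
		if (with_patch)
			c->preferred_offline = CPU_OFFLINE;
		model_cpu_down(c);
	} else if (c->state == CPU_INACTIVE) {
		/* Assumed: second step of the existing two-step sequence. */
		c->state = CPU_OFFLINE;
	}
	return c->state == CPU_OFFLINE ? 0 : -1;
}

/* Walks through the single-write case with and without the patch. */
static void model_demo(void)
{
	struct cpu_model c;

	model_init(&c, true);
	assert(model_release(&c, false) == -1); /* stuck in INACTIVE */

	model_init(&c, true);
	assert(model_release(&c, true) == 0);   /* reaches OFFLINE */
}
```

In this model a single release write on an ONLINE cpu only reaches
OFFLINE with the patch applied, while the two-step sequence (online
file first, then release) reaches OFFLINE either way.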

This patch will work and I do not see any problem as such, but we
still need to carefully evaluate the retry and error handling paths
before switching over to a single sysfs write based dlpar operation.

--Vaidy


