[PATCH 1/1] hotplug-cpu.c: show 'last online CPU' error in dlpar_cpu_remove()

Daniel Henrique Barboza danielhb413 at gmail.com
Tue Mar 23 10:30:36 AEDT 2021



On 3/19/21 8:26 AM, Michael Ellerman wrote:
> Daniel Henrique Barboza <danielhb413 at gmail.com> writes:
>> Ping
>>
>> On 3/5/21 2:38 PM, Daniel Henrique Barboza wrote:
>>> Of all the reasons that dlpar_cpu_remove() can fail, the 'last online
>>> CPU' is one that can be caused directly by the user offlining CPUs
>>> in a partition/virtual machine that has hotplugged CPUs. Trying to
>>> reclaim a hotplugged CPU can fail if the CPU is now the last online in
>>> the system. This is easily reproduced using QEMU [1].
> 
> Sorry, I saw this earlier and never got around to replying.

No problem. Thanks for the review!

> 
> I'm wondering if we need to catch it earlier, i.e. in
> dlpar_offline_cpu().
> 
> By the time we return to dlpar_cpu_remove() we've dropped the
> cpu_add_remove_lock (cpu_maps_update_done), so num_online_cpus() could
> change out from under us, meaning the num_online_cpus() == 1 check might
> trigger incorrectly (or vice versa).
> 
> Something like the patch below (completely untested :D)

Makes sense. I'll try it out to see if it works.

> 
> And writing that patch makes me wonder, is == 1 the right check?
> 
> In most cases we'll remove all but one thread of the core, but we'll
> fail on the last thread. Leaving that core effectively stuck in SMT1. Is
> that useful behaviour? Should we instead check at the start that we can
> remove all threads of the core without going to zero online CPUs?

I think it's OK to allow SMT1 cores, speaking from the QEMU perspective.
QEMU does not have a "core hotunplug" operation where the whole core is
hotunplugged at once. CPU hotplug/unplug operations are done as the
addition/removal of a single CPU thread. If the user wants to run 4 cores,
all of them in SMT1, QEMU will allow it.

Libvirt does not operate at core granularity either - you can specify
the number of vCPUs the guest should run with, and Libvirt will send
hotplug/unplug requests to QEMU to match the desired value. It does not
track how many threads of a given core have been offlined.
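
To make the two checks being compared concrete, here is a small userspace
model in C. It is only a sketch: the bitmask representation and the helper
names (online_count, remove_core_per_thread, check_remove_whole_core) are
made up for illustration and are not kernel APIs, and it ignores locking
entirely.

```c
#include <errno.h>
#include <stdint.h>

/* Toy model: bit N of the mask is set when CPU thread N is online. */
static int online_count(uint64_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Per-thread check (the "== 1" test): offline the threads in core_mask one
 * at a time and stop with -EBUSY when the next thread is the last online CPU.
 */
static int remove_core_per_thread(uint64_t *online, uint64_t core_mask)
{
	for (int cpu = 0; cpu < 64; cpu++) {
		uint64_t bit = UINT64_C(1) << cpu;

		if (!(core_mask & *online & bit))
			continue;
		if (online_count(*online) == 1)
			return -EBUSY;	/* core is left with one thread online */
		*online &= ~bit;
	}
	return 0;
}

/*
 * Up-front check: refuse before touching anything if removing every thread
 * of the core would leave zero CPUs online.
 */
static int check_remove_whole_core(uint64_t online, uint64_t core_mask)
{
	return (online & ~core_mask) ? 0 : -EBUSY;
}
```

With a single 4-thread core fully online (mask 0xF), the per-thread variant
offlines three threads and then fails on the fourth, leaving the mask at 0x8,
i.e. the core stuck in SMT1. The up-front variant returns -EBUSY for the same
input without changing anything.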


Thanks,


DHB



> 
> cheers
> 
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> index 12cbffd3c2e3..498c22331ac8 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
> @@ -271,6 +271,12 @@ static int dlpar_offline_cpu(struct device_node *dn)
>   			if (!cpu_online(cpu))
>   				break;
>   
> +			if (num_online_cpus() == 1) {
> +				pr_warn("Unable to remove last online CPU %pOFn\n", dn);
> +				rc = -EBUSY;
> +				goto out_unlock;
> +			}
> +
>   			cpu_maps_update_done();
>   			rc = device_offline(get_cpu_device(cpu));
>   			if (rc)
> @@ -283,6 +289,7 @@ static int dlpar_offline_cpu(struct device_node *dn)
>   				thread);
>   		}
>   	}
> +out_unlock:
>   	cpu_maps_update_done();
>   
>   out:
> 

