[PATCH 13/29] powerpc/pseries/mobility: use stop_machine for join/suspend

Sat Dec 5 22:03:02 AEDT 2020

Nathan Lynch <nathanl at linux.ibm.com> writes:
> Hi Michael,
>
> Michael Ellerman <mpe at ellerman.id.au> writes:
>> Nathan Lynch <nathanl at linux.ibm.com> writes:
>>> The partition suspend sequence as specified in the platform
>>> architecture requires that all active processor threads call
>>> H_JOIN, which:
>> ...
>>> diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
>>> index 1b8ae221b98a..44ca7d4e143d 100644
>>> --- a/arch/powerpc/platforms/pseries/mobility.c
>>> +++ b/arch/powerpc/platforms/pseries/mobility.c
>>> @@ -412,6 +414,128 @@ static int wait_for_vasi_session_suspending(u64 handle)
>> ...
>>
>>> +
>>> +static int do_join(void *arg)
>>> +{
>>> +	atomic_t *counter = arg;
>>> +	long hvrc;
>>> +	int ret;
>>> +
>>> +	/* Must ensure MSR.EE off for H_JOIN. */
>>> +	hard_irq_disable();
>>
>> Didn't stop_machine() already do that for us?
>>
>> In the state machine in multi_cpu_stop().
>
> Yes, but I didn't want to rely on something that seems like an
> implementation detail of stop_machine(). I assumed it's benign and in
> keeping with hard_irq_disable()'s intended semantics to make multiple
> calls to it within a critical section.

OK. I think it's part of the contract of stop_machine() these days, but
you're right hard_irq_disable() can be called multiple times, so we may
as well leave it there as insurance/documentation.

>>> +	hvrc = plpar_hcall_norets(H_JOIN);
>>> +
>>> +	switch (hvrc) {
>>> +	case H_CONTINUE:
>>> +		/*
>>> +		 * All other CPUs are offline or in H_JOIN. This CPU
>>> +		 * attempts the suspend.
>>> +		 */
>>> +		ret = do_suspend();
>>> +		break;
>>> +	case H_SUCCESS:
>>> +		/*
>>> +		 * The suspend is complete and this cpu has received a
>>> +		 * prod.
>>> +		 */
>>> +		ret = 0;
>>> +		break;
>>> +	case H_BAD_MODE:
>>> +	case H_HARDWARE:
>>> +	default:
>>> +		ret = -EIO;
>>> +		pr_err_ratelimited("H_JOIN error %ld on CPU %i\n",
>>> +				   hvrc, smp_processor_id());
>>> +		break;
>>> +	}
>>> +
>>> +	if (atomic_inc_return(counter) == 1) {
>>> +		pr_info("CPU %u waking all threads\n", smp_processor_id());
>>> +		prod_others();
>>> +	}
>>
>> Do we even need the counter? IIUC only one CPU receives H_CONTINUE. So
>> couldn't we just have that CPU do the prodding of others?
>
> CPUs may exit H_JOIN due to system reset interrupt at any time, and
> H_JOIN may return H_HARDWARE to a caller after other CPUs have entered
> the join state successfully. In these cases the counter ensures exactly
> one thread performs the prod sequence.

OK.

>>> +	/*
>>> +	 * Execution may have been suspended for several seconds, so
>>> +	 * reset the watchdog.
>>> +	 */
>>> +	touch_nmi_watchdog();
>>> +	return ret;
>>> +}
>>> +
>>> +static int pseries_migrate_partition(u64 handle)
>>> +{
>>> +	atomic_t counter = ATOMIC_INIT(0);
>>> +	int ret;
>>> +
>>> +	ret = wait_for_vasi_session_suspending(handle);
>>> +	if (ret)
>>> +		goto out;
>>
>> Direct return would be clearer IMHO.
>
> OK, I can change this.

Thanks.

cheers