[PATCH v2] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

Haren Myneni haren at linux.ibm.com
Fri Dec 1 14:56:05 AEDT 2023



On 11/29/23 5:43 PM, Nathan Lynch wrote:
> Haren Myneni <haren at linux.ibm.com> writes:
>> VAS allocate, modify and deallocate HCALLs returns
>> H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC for busy
>> delay and expects OS to reissue HCALL after that delay. But using
>> msleep() will often sleep at least 20 msecs even though the
>> hypervisor expects to reissue these HCALLs after 1 or 10msecs.
> 
> I would word this as "the architecture suggests that the OS reissue
> these [...]" instead of framing it as something the platform "expects".
> 
>> It might cause these HCALLs takes longer when multiple threads
>> issue open or close VAS windows simultaneously.
> 
> This is imprecise. Over-sleeping by the OS doesn't cause individual
> hcalls to take longer. It is more accurate to say that the higher-level
> operation (allocate, modify, free) may take longer than necessary in
> cases where the OS must retry the hcalls involved.

Correct, takes longer with multiple threads opening/closing windows. I 
will make it clear.

> 
>> So instead of msleep(), use usleep_range() to ensure sleep with
>> the expected value before issuing HCALL again.
>>
>> Signed-off-by: Haren Myneni <haren at linux.ibm.com>
>> Suggested-by: Nathan Lynch <nathanl at linux.ibm.com>
>>
>> ---
>> v1 -> v2:
>> - Use usleep_range instead of using RTAS sleep routine as
>>    suggested by Nathan
>> ---
>>   arch/powerpc/platforms/pseries/vas.c | 24 +++++++++++++++++++++++-
>>   1 file changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
>> index 71d52a670d95..bade4402741f 100644
>> --- a/arch/powerpc/platforms/pseries/vas.c
>> +++ b/arch/powerpc/platforms/pseries/vas.c
>> @@ -36,9 +36,31 @@ static bool migration_in_progress;
>>   
>>   static long hcall_return_busy_check(long rc)
>>   {
>> +	unsigned int ms;
> 
> This should move down into the H_IS_LONG_BUSY() block if it's not used
> outside of it.
> 
>> +
>>   	/* Check if we are stalled for some time */
>>   	if (H_IS_LONG_BUSY(rc)) {
>> -		msleep(get_longbusy_msecs(rc));
>> +		ms = get_longbusy_msecs(rc);
>> +		/*
>> +		 * Allocate, Modify and Deallocate HCALLs returns
>> +		 * H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
>> +		 * for the long delay. So the delay should always be 1
>> +		 * or 10msecs, but sleeps 1msec in case if the long
>> +		 * delay is > H_LONG_BUSY_ORDER_10_MSEC.
>> +		 */
>> +		if (ms > 10)
>> +			ms = 1;
> 
> It's strange to coerce ms to 1 when it's greater than 10. Just clamp it
> to 10, e.g.
> 
>                  ms = clamp(get_longbusy_msecs(rc), 1, 10);

Sure, these HCALLs should not return > H_LONG_BUSY_ORDER_10_MSEC.

> 
>> +
>> +		/*
>> +		 * msleep() will often sleep at least 20 msecs even
>> +		 * though the hypervisor expects to reissue these
>> +		 * HCALLs after 1 or 10msecs. So use usleep_range()
>> +		 * to sleep with the expected value.
>> +		 *
>> +		 * See Documentation/timers/timers-howto.rst on using
>> +		 * the value range in usleep_range().
>> +		 */
>> +		usleep_range(ms * 100, ms * 1000);
> 
> If there's going to be commentary here I think it should just explain
> why potentially sleeping for less than the suggested time is OK. There
> is wording you can crib in rtas_busy_delay().
> 
> 
>>   		rc = H_BUSY;
>>   	} else if (rc == H_BUSY) {
>>   		cond_resched();
>> -- 
>> 2.26.3


More information about the Linuxppc-dev mailing list