[PATCH v2] powerpc/pseries/vas: Use usleep_range() to support HCALL delay

Nathan Lynch nathanl at linux.ibm.com
Thu Nov 30 12:43:47 AEDT 2023


Haren Myneni <haren at linux.ibm.com> writes:
> VAS allocate, modify and deallocate HCALLs returns
> H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC for busy
> delay and expects OS to reissue HCALL after that delay. But using
> msleep() will often sleep at least 20 msecs even though the
> hypervisor expects to reissue these HCALLs after 1 or 10msecs.

I would word this as "the architecture suggests that the OS reissue
these [...]" instead of framing it as something the platform "expects".

> It might cause these HCALLs takes longer when multiple threads
> issue open or close VAS windows simultaneously.

This is imprecise. Over-sleeping by the OS doesn't cause individual
hcalls to take longer. It is more accurate to say that the higher-level
operation (allocate, modify, free) may take longer than necessary in
cases where the OS must retry the hcalls involved.

> So instead of msleep(), use usleep_range() to ensure sleep with
> the expected value before issuing HCALL again.
>
> Signed-off-by: Haren Myneni <haren at linux.ibm.com>
> Suggested-by: Nathan Lynch <nathanl at linux.ibm.com>
>
> ---
> v1 -> v2:
> - Use usleep_range instead of using RTAS sleep routine as
>   suggested by Nathan
> ---
>  arch/powerpc/platforms/pseries/vas.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
> index 71d52a670d95..bade4402741f 100644
> --- a/arch/powerpc/platforms/pseries/vas.c
> +++ b/arch/powerpc/platforms/pseries/vas.c
> @@ -36,9 +36,31 @@ static bool migration_in_progress;
>  
>  static long hcall_return_busy_check(long rc)
>  {
> +	unsigned int ms;

This should move down into the H_IS_LONG_BUSY() block if it's not used
outside of it.

> +
>  	/* Check if we are stalled for some time */
>  	if (H_IS_LONG_BUSY(rc)) {
> -		msleep(get_longbusy_msecs(rc));
> +		ms = get_longbusy_msecs(rc);
> +		/*
> +		 * Allocate, Modify and Deallocate HCALLs returns
> +		 * H_LONG_BUSY_ORDER_1_MSEC or H_LONG_BUSY_ORDER_10_MSEC
> +		 * for the long delay. So the delay should always be 1
> +		 * or 10msecs, but sleeps 1msec in case if the long
> +		 * delay is > H_LONG_BUSY_ORDER_10_MSEC.
> +		 */
> +		if (ms > 10)
> +			ms = 1;

It's strange to coerce ms to 1 when it's greater than 10. Just clamp it
to 10, e.g.

                ms = clamp(get_longbusy_msecs(rc), 1, 10);

> +
> +		/*
> +		 * msleep() will often sleep at least 20 msecs even
> +		 * though the hypervisor expects to reissue these
> +		 * HCALLs after 1 or 10msecs. So use usleep_range()
> +		 * to sleep with the expected value.
> +		 *
> +		 * See Documentation/timers/timers-howto.rst on using
> +		 * the value range in usleep_range().
> +		 */
> +		usleep_range(ms * 100, ms * 1000);

If there's going to be commentary here I think it should just explain
why potentially sleeping for less than the suggested time is OK. There
is wording you can crib in rtas_busy_delay().


>  		rc = H_BUSY;
>  	} else if (rc == H_BUSY) {
>  		cond_resched();
> -- 
> 2.26.3


More information about the Linuxppc-dev mailing list