[PATCH 8/8] powerpc/rtas: consume retry statuses in sys_rtas()
Andrew Donnellan
ajd at linux.ibm.com
Thu Mar 23 17:26:29 AEDT 2023
On Mon, 2023-03-06 at 15:33 -0600, Nathan Lynch via B4 Relay wrote:
> From: Nathan Lynch <nathanl at linux.ibm.com>
>
> The kernel can handle retrying RTAS function calls in response to
> -2/990x in the sys_rtas() handler instead of relaying the intermediate
> status to user space.
>
> Justifications:
>
> * Currently it's nondeterministic and quite variable in practice
> whether a retry status is returned for any given invocation of
> sys_rtas(). Therefore user space code cannot be expecting a retry
> result without already being broken.
>
> * This tends to significantly reduce the total number of system calls
> issued by programs such as drmgr which make use of sys_rtas(),
> improving the experience of tracing and debugging such
> programs. This is the main motivation for me: I think this change
> will make it easier for us to characterize current sys_rtas() use
> cases as we move them to other interfaces over time.
>
> * It reduces the number of opportunities for user space to leave
> complex operations, such as those associated with DLPAR, incomplete
> and difficult to recover.
>
> * We can expect performance improvements for existing sys_rtas()
> users, not only because of overall reduction in the number of system
> calls issued, but also due to the better handling of -2/990x in the
> kernel. For example, librtas still sleeps for 1ms on -2, which is
> completely unnecessary.

It would be good to see this fixed on the librtas side.

>
> Performance differences for PHB add and remove on a small P10 PowerVM
> partition are included below. For add, elapsed time is slightly
> reduced. For remove, there are more significant improvements: the
> number of context switches is reduced by an order of magnitude, and
> elapsed time is reduced by over half.
>
> (- before, + after):
>
> Performance counter stats for 'drmgr -c phb -a -s PHB 23' (5 runs):
>
> -  1,847.58 msec task-clock  #   0.135 CPUs utilized  ( +- 14.15% )
> -    10,867      cs          #   9.800 K/sec          ( +- 14.14% )
> +  1,901.15 msec task-clock  #   0.148 CPUs utilized  ( +- 14.13% )
> +    10,451      cs          #   9.158 K/sec          ( +- 14.14% )
>
> - 13.656557 +- 0.000124 seconds time elapsed ( +- 0.00% )
> + 12.88080 +- 0.00404 seconds time elapsed ( +- 0.03% )
>
> Performance counter stats for 'drmgr -c phb -r -s PHB 23' (5 runs):
>
> -  1,473.75 msec task-clock  #   0.092 CPUs utilized  ( +- 14.15% )
> -     2,652      cs          #   3.000 K/sec          ( +- 14.16% )
> +  1,444.55 msec task-clock  #   0.221 CPUs utilized  ( +- 14.14% )
> +       104      cs          # 119.957 /sec           ( +- 14.63% )
>
> - 15.99718 +- 0.00801 seconds time elapsed ( +- 0.05% )
> + 6.54256 +- 0.00830 seconds time elapsed ( +- 0.13% )
>
> Move the existing rtas_lock-guarded critical section in sys_rtas()
> into a conventional rtas_busy_delay()-based loop, returning to user
> space only when a final success or failure result is available.
>
> Signed-off-by: Nathan Lynch <nathanl at linux.ibm.com>

Should there be some kind of timeout? I'm a bit worried about sleeping
in a syscall for an extended period.
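
To make the question concrete, here is a rough user-space sketch of a
retry loop bounded by a time budget and a call limit. It is not the
code in the patch and does not use rtas_busy_delay(). The names
call_with_budget(), retry_delay_ms() and scripted_call() are
hypothetical, the "sleep" is only accounted for rather than performed,
and it assumes that a 990x status suggests a wait of 10^x milliseconds
while -2 means "retry immediately".

```c
#include <assert.h>
#include <stddef.h>

/* Retry statuses discussed in the thread: -2 (busy) and 9900..9905. */
#define RTAS_BUSY          (-2)
#define RTAS_EXT_DELAY_MIN 9900
#define RTAS_EXT_DELAY_MAX 9905

static int is_retry_status(int status)
{
	return status == RTAS_BUSY ||
	       (status >= RTAS_EXT_DELAY_MIN && status <= RTAS_EXT_DELAY_MAX);
}

/* Assumption: 990x suggests waiting 10^x ms; any other status maps to 0. */
static unsigned int retry_delay_ms(int status)
{
	unsigned int ms = 1;
	int order;

	if (status < RTAS_EXT_DELAY_MIN || status > RTAS_EXT_DELAY_MAX)
		return 0;
	for (order = status - RTAS_EXT_DELAY_MIN; order > 0; order--)
		ms *= 10;
	return ms;
}

/* Hypothetical stand-in for "invoke the RTAS function once". */
typedef int (*rtas_call_fn)(void *ctx);

/*
 * Call fn until it returns a final status, the next suggested delay
 * would exceed the remaining budget, or max_calls is reached. In the
 * latter two cases the last retry status is returned to the caller.
 */
static int call_with_budget(rtas_call_fn fn, void *ctx,
			    unsigned int budget_ms, unsigned int max_calls,
			    unsigned int *waited_ms)
{
	unsigned int waited = 0, calls = 0;
	int status;

	assert(fn != NULL && max_calls > 0);
	do {
		unsigned int delay;

		status = fn(ctx);
		calls++;
		if (!is_retry_status(status))
			break;			/* final success or failure */
		delay = retry_delay_ms(status);
		if (delay > budget_ms - waited)
			break;			/* budget exhausted: give up */
		waited += delay;		/* a real loop would sleep here */
	} while (calls < max_calls);

	if (waited_ms)
		*waited_ms = waited;
	return status;
}

/* Test double: replays a fixed list of statuses, repeating the last one. */
struct script {
	const int *statuses;
	size_t len;
	size_t pos;
};

static int scripted_call(void *ctx)
{
	struct script *s = ctx;
	int status = s->statuses[s->pos];

	if (s->pos + 1 < s->len)
		s->pos++;
	return status;
}
```

With a bound like this, user space would still occasionally see a
retry status, so whether that is acceptable depends on how strongly we
want the "final result only" guarantee described above.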
--
Andrew Donnellan OzLabs, ADL Canberra
ajd at linux.ibm.com IBM Australia Limited