[PATCH v4 05/13] powerpc/rtas: Facilitate high-level call sequences

Aneesh Kumar K.V (IBM) aneesh.kumar at kernel.org
Mon Nov 20 19:10:22 AEDT 2023


Nathan Lynch via B4 Relay <devnull+nathanl.linux.ibm.com at kernel.org>
writes:

> From: Nathan Lynch <nathanl at linux.ibm.com>
>
> On RTAS platforms there is a general restriction that the OS must not
> enter RTAS on more than one CPU at a time. This low-level
> serialization requirement is satisfied by holding a spin
> lock (rtas_lock) across most RTAS function invocations.
>
> However, some pseries RTAS functions require multiple successive calls
> to complete a logical operation. Beginning a new call sequence for such a
> function may disrupt any other sequences of that function already in
> progress. Safe and reliable use of these functions effectively
> requires higher-level serialization beyond what is already done at the
> level of RTAS entry and exit.
>
> Where a sequence-based RTAS function is invoked only through
> sys_rtas(), with no in-kernel users, there is no issue as far as the
> kernel is concerned. User space is responsible for appropriately
> serializing its call sequences. (Whether user space code actually
> takes measures to prevent sequence interleaving is another matter.)
> Examples of such functions currently include ibm,platform-dump and
> ibm,get-vpd.
>
> But where a sequence-based RTAS function has both user space and
> in-kernel users, there is a hazard. Even if the in-kernel call sites
> of such a function serialize their sequences correctly, a user of
> sys_rtas() can invoke the same function at any time, potentially
> disrupting a sequence in progress.
>
> So in order to prevent disruption of kernel-based RTAS call sequences,
> they must serialize not only with themselves but also with sys_rtas()
> users, somehow. Preferably without adding global locks or adding more
> function-specific hacks to sys_rtas(). This is a prerequisite for
> adding an in-kernel user of ibm,get-vpd call sequences, which comes in
> a change to follow.
>
> Note that it has never been feasible for the kernel to prevent
> sys_rtas()-based sequences from being disrupted because control
> returns to user space on every call. sys_rtas()-based users of these
> functions have always been, and continue to be, responsible for
> coordinating their call sequences with other users, even those which
> may invoke the RTAS functions through less direct means than
> sys_rtas(). This is an unavoidable consequence of exposing
> sequence-based RTAS functions through sys_rtas().
>
> * Add new rtas_function_lock() and rtas_function_unlock() APIs for use
>   with sequence-based RTAS functions.
>
> * Add an optional per-function mutex to struct rtas_function. When this
>   member is set, kernel-internal callers of the RTAS function are
>   required to guard their call sequences with rtas_function_lock() and
>   rtas_function_unlock(). This requirement will be enforced in a later
>   change, after all affected call sites are updated.
>
> * Populate the lock members of function table entries where
>   serialization of call sequences is known to be necessary, along with
>   justifying commentary.
>
> * In sys_rtas(), acquire the per-function mutex when it is present.
>
> There should be no perceivable change introduced here except that
> concurrent callers of the same RTAS function via sys_rtas() may block
> on a mutex instead of spinning on rtas_lock. Changes to follow will
> add rtas_function_lock()/unlock() pairs to kernel-based call
> sequences.
>

Can you add an example of the last part? I had to look at patch 06 to
find the details:

	rtas_function_lock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);

	do {
		fwrc = rtas_call(token, 0, 1, NULL);
	} while (rtas_busy_delay(fwrc));

	rtas_function_unlock(RTAS_FN_IBM_ACTIVATE_FIRMWARE);
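
For the stateful sequences (ibm,get-vpd and friends) I assume the
eventual in-kernel user will end up looking roughly like the sketch
below, with the lock held across *all* calls of the sequence, since
the sequence number returned by one call feeds the next. (Illustrative
only: the argument layout and status handling are my reading of PAPR+
7.3.20, and the work-area parameters are placeholders; none of this is
taken from this series.)

	#include <linux/errno.h>
	#include <asm/rtas.h>
	#include <asm/rtas-work-area.h>

	static int get_vpd_sequence_sketch(struct rtas_work_area *loc_code,
					   struct rtas_work_area *buf)
	{
		const s32 token = rtas_function_token(RTAS_FN_IBM_GET_VPD);
		u32 sequence = 1;	/* first call of a new sequence */
		int rets[2];
		int fwrc;

		if (token == RTAS_UNKNOWN_SERVICE)
			return -ENOENT;

		/* Serialize against sys_rtas() and other kernel sequences. */
		rtas_function_lock(RTAS_FN_IBM_GET_VPD);

		do {
			fwrc = rtas_call(token, 3, 3, rets,
					 rtas_work_area_phys(loc_code),
					 rtas_work_area_phys(buf),
					 sequence);
			if (fwrc == 0 || fwrc == 1) {
				/* rets[0]: next sequence number, rets[1]:
				 * bytes written (my reading of PAPR+, not
				 * this series). */
				sequence = rets[0];
				/* ... consume the contents of buf here ... */
			}
		} while (rtas_busy_delay(fwrc) || fwrc == 1 /* more data */);

		rtas_function_unlock(RTAS_FN_IBM_GET_VPD);

		return fwrc < 0 ? -EIO : 0;
	}

The point being that dropping the lock between the individual
rtas_call()s would let a sys_rtas() user start a competing ibm,get-vpd
sequence and disrupt the one in progress.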

Reviewed-by: Aneesh Kumar K.V (IBM) <aneesh.kumar at kernel.org>

>
> Signed-off-by: Nathan Lynch <nathanl at linux.ibm.com>
> ---
>  arch/powerpc/include/asm/rtas.h |   2 +
>  arch/powerpc/kernel/rtas.c      | 101 +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 101 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index b73010583a8d..9a20caba6858 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -421,6 +421,8 @@ static inline bool rtas_function_implemented(const rtas_fn_handle_t handle)
>  {
>  	return rtas_function_token(handle) != RTAS_UNKNOWN_SERVICE;
>  }
> +void rtas_function_lock(rtas_fn_handle_t handle);
> +void rtas_function_unlock(rtas_fn_handle_t handle);
>  extern int rtas_token(const char *service);
>  extern int rtas_service_present(const char *service);
>  extern int rtas_call(int token, int, int, int *, ...);
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 1fc0b3fffdd1..52f2242d0c28 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -18,6 +18,7 @@
>  #include <linux/kernel.h>
>  #include <linux/lockdep.h>
>  #include <linux/memblock.h>
> +#include <linux/mutex.h>
>  #include <linux/of.h>
>  #include <linux/of_fdt.h>
>  #include <linux/reboot.h>
> @@ -70,14 +71,34 @@ struct rtas_filter {
>   *                            ppc64le, and we want to keep it that way. It does
>   *                            not make sense for this to be set when @filter
>   *                            is NULL.
> + * @lock: Pointer to an optional dedicated per-function mutex. This
> + *        should be set for functions that require multiple calls in
> + *        sequence to complete a single operation, and such sequences
> + *        will disrupt each other if allowed to interleave. Users of
> + *        this function are required to hold the associated lock for
> + *        the duration of the call sequence. Add an explanatory
> + *        comment to the function table entry if setting this member.
>   */
>  struct rtas_function {
>  	s32 token;
>  	const bool banned_for_syscall_on_le:1;
>  	const char * const name;
>  	const struct rtas_filter *filter;
> +	struct mutex *lock;
>  };
>  
> +/*
> + * Per-function locks. Access these through the
> + * rtas_function_lock/unlock APIs, not directly.
> + */
> +static DEFINE_MUTEX(rtas_ibm_activate_firmware_lock);
> +static DEFINE_MUTEX(rtas_ibm_get_dynamic_sensor_state_lock);
> +static DEFINE_MUTEX(rtas_ibm_get_indices_lock);
> +static DEFINE_MUTEX(rtas_ibm_get_vpd_lock);
> +static DEFINE_MUTEX(rtas_ibm_lpar_perftools_lock);
> +static DEFINE_MUTEX(rtas_ibm_physical_attestation_lock);
> +static DEFINE_MUTEX(rtas_ibm_set_dynamic_indicator_lock);
> +
>  static struct rtas_function rtas_function_table[] __ro_after_init = {
>  	[RTAS_FNIDX__CHECK_EXCEPTION] = {
>  		.name = "check-exception",
> @@ -125,6 +146,13 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = -1, .size_idx1 = -1,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR doesn't explicitly impose any restriction, but
> +		 * this typically requires multiple calls before
> +		 * success, and there's no reason to allow sequences
> +		 * to interleave.
> +		 */
> +		.lock = &rtas_ibm_activate_firmware_lock,
>  	},
>  	[RTAS_FNIDX__IBM_CBE_START_PTCAL] = {
>  		.name = "ibm,cbe-start-ptcal",
> @@ -196,6 +224,12 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 1, .size_idx1 = -1,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.19–3 is explicit that the OS must not
> +		 * call ibm,get-dynamic-sensor-state with different
> +		 * inputs until a non-retry status has been returned.
> +		 */
> +		.lock = &rtas_ibm_get_dynamic_sensor_state_lock,
>  	},
>  	[RTAS_FNIDX__IBM_GET_INDICES] = {
>  		.name = "ibm,get-indices",
> @@ -203,6 +237,12 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 2, .size_idx1 = 3,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.17–2 says that the OS must not
> +		 * interleave ibm,get-indices call sequences with
> +		 * different inputs.
> +		 */
> +		.lock = &rtas_ibm_get_indices_lock,
>  	},
>  	[RTAS_FNIDX__IBM_GET_RIO_TOPOLOGY] = {
>  		.name = "ibm,get-rio-topology",
> @@ -220,6 +260,11 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 0, .size_idx1 = -1,
>  			.buf_idx2 = 1, .size_idx2 = 2,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.20–4 indicates that sequences
> +		 * should not be allowed to interleave.
> +		 */
> +		.lock = &rtas_ibm_get_vpd_lock,
>  	},
>  	[RTAS_FNIDX__IBM_GET_XIVE] = {
>  		.name = "ibm,get-xive",
> @@ -239,6 +284,11 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 2, .size_idx1 = 3,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.26–6 says the OS should allow only one
> +		 * call sequence in progress at a time.
> +		 */
> +		.lock = &rtas_ibm_lpar_perftools_lock,
>  	},
>  	[RTAS_FNIDX__IBM_MANAGE_FLASH_IMAGE] = {
>  		.name = "ibm,manage-flash-image",
> @@ -277,6 +327,14 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 0, .size_idx1 = 1,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * This follows a sequence-based pattern similar to
> +		 * ibm,get-vpd et al. Since PAPR+ restricts
> +		 * interleaving call sequences for other functions of
> +		 * this style, assume the restriction applies here,
> +		 * even though it's not explicit in the spec.
> +		 */
> +		.lock = &rtas_ibm_physical_attestation_lock,
>  	},
>  	[RTAS_FNIDX__IBM_PLATFORM_DUMP] = {
>  		.name = "ibm,platform-dump",
> @@ -284,6 +342,13 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 4, .size_idx1 = 5,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ 7.3.3.4.1 indicates that concurrent sequences
> +		 * of ibm,platform-dump are allowed if they are
> +		 * operating on different dump tags. So leave
> +		 * the lock pointer unset for now. This may need
> +		 * reconsideration if kernel-internal users appear.
> +		 */
>  	},
>  	[RTAS_FNIDX__IBM_POWER_OFF_UPS] = {
>  		.name = "ibm,power-off-ups",
> @@ -326,6 +391,12 @@ static struct rtas_function rtas_function_table[] __ro_after_init = {
>  			.buf_idx1 = 2, .size_idx1 = -1,
>  			.buf_idx2 = -1, .size_idx2 = -1,
>  		},
> +		/*
> +		 * PAPR+ R1–7.3.18–3 says the OS must not call this
> +		 * function with different inputs until a non-retry
> +		 * status has been returned.
> +		 */
> +		.lock = &rtas_ibm_set_dynamic_indicator_lock,
>  	},
>  	[RTAS_FNIDX__IBM_SET_EEH_OPTION] = {
>  		.name = "ibm,set-eeh-option",
> @@ -556,9 +627,9 @@ static int __init rtas_token_to_function_xarray_init(void)
>  }
>  arch_initcall(rtas_token_to_function_xarray_init);
>  
> -static const struct rtas_function *rtas_token_to_function(s32 token)
> +static struct rtas_function *rtas_token_to_function(s32 token)
>  {
> -	const struct rtas_function *func;
> +	struct rtas_function *func;
>  
>  	if (WARN_ONCE(token < 0, "invalid token %d", token))
>  		return NULL;
> @@ -581,6 +652,28 @@ static const struct rtas_function *rtas_token_to_function(s32 token)
>  	return NULL;
>  }
>  
> +static void __rtas_function_lock(struct rtas_function *func)
> +{
> +	if (func && func->lock)
> +		mutex_lock(func->lock);
> +}
> +
> +static void __rtas_function_unlock(struct rtas_function *func)
> +{
> +	if (func && func->lock)
> +		mutex_unlock(func->lock);
> +}
> +
> +void rtas_function_lock(const rtas_fn_handle_t handle)
> +{
> +	__rtas_function_lock(rtas_function_lookup(handle));
> +}
> +
> +void rtas_function_unlock(const rtas_fn_handle_t handle)
> +{
> +	__rtas_function_unlock(rtas_function_lookup(handle));
> +}
> +
>  /* This is here deliberately so it's only used in this file */
>  void enter_rtas(unsigned long);
>  
> @@ -1885,6 +1978,8 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
>  
>  	buff_copy = get_errorlog_buffer();
>  
> +	__rtas_function_lock(rtas_token_to_function(token));
> +
>  	raw_spin_lock_irqsave(&rtas_lock, flags);
>  	cookie = lockdep_pin_lock(&rtas_lock);
>  
> @@ -1900,6 +1995,8 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
>  	lockdep_unpin_lock(&rtas_lock, cookie);
>  	raw_spin_unlock_irqrestore(&rtas_lock, flags);
>  
> +	__rtas_function_unlock(rtas_token_to_function(token));
> +
>  	if (buff_copy) {
>  		if (errbuf)
>  			log_error(errbuf, ERR_TYPE_RTAS_LOG, 0);
>
> -- 
> 2.41.0

