[PATCH] powerpc/fadump: Add timeout to RTAS busy-wait loops
Sourabh Jain
sourabhjain at linux.ibm.com
Mon Apr 13 23:50:22 AEST 2026
Hello Adriano,
On 06/04/26 11:45, Adriano Vero wrote:
> The ibm,configure-kernel-dump RTAS call sites in
> rtas_fadump_register(), rtas_fadump_unregister(), and
> rtas_fadump_invalidate() polled indefinitely while firmware returned
> a busy status. A misbehaving or hung firmware could stall these paths
> forever, blocking fadump registration at boot or preventing clean
> teardown.
I agree that it is a good idea to bound how long the kernel keeps
retrying rtas_call for fadump operations. However, so far I have
not come across a case where the kernel gets stuck during fadump
registration, unregistration, or invalidation because phyp/RTAS
keeps returning a wait time on an LPAR.
That said, since fadump support has recently been extended to
QEMU, this change may prove useful in that environment.
>
> Track the accumulated delay in a total_wait counter and bail out with
> -ETIMEDOUT if it reaches RTAS_FADUMP_MAX_WAIT_MS (60 seconds)
What is the rationale behind choosing a 60-second limit?
> before
> firmware signals completion. This follows the bounded busy-wait pattern
> used in rtas-rtc.c.
>
> Signed-off-by: Adriano Vero <litaliano00.contact at gmail.com>
> ---
> arch/powerpc/platforms/pseries/rtas-fadump.c | 37 ++++++++++++++------
> arch/powerpc/platforms/pseries/rtas-fadump.h | 6 ++++
> 2 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/rtas-fadump.c b/arch/powerpc/platforms/pseries/rtas-fadump.c
> index eceb32893..b165f165c 100644
> --- a/arch/powerpc/platforms/pseries/rtas-fadump.c
> +++ b/arch/powerpc/platforms/pseries/rtas-fadump.c
> @@ -181,7 +181,7 @@ static u64 rtas_fadump_get_bootmem_min(void)
>
> static int rtas_fadump_register(struct fw_dump *fadump_conf)
> {
> - unsigned int wait_time, fdm_size;
> + unsigned int wait_time, total_wait, fdm_size;
> int rc, err = -EIO;
>
> /*
> @@ -192,15 +192,20 @@ static int rtas_fadump_register(struct fw_dump *fadump_conf)
> fdm_size = sizeof(struct rtas_fadump_section_header);
> fdm_size += be16_to_cpu(fdm.header.dump_num_sections) * sizeof(struct rtas_fadump_section);
>
> - /* TODO: Add upper time limit for the delay */
> + total_wait = 0;
> do {
> rc = rtas_call(fadump_conf->ibm_configure_kernel_dump, 3, 1,
> NULL, FADUMP_REGISTER, &fdm, fdm_size);
>
> wait_time = rtas_busy_delay_time(rc);
> - if (wait_time)
> + if (wait_time) {
> + if (total_wait >= RTAS_FADUMP_MAX_WAIT_MS) {
> + pr_err("Timed out waiting for firmware to register fadump\n");
> + return -ETIMEDOUT;
> + }
> + total_wait += wait_time;
> mdelay(wait_time);
> -
> + }
> } while (wait_time);
>
> switch (rc) {
> @@ -234,18 +239,24 @@ static int rtas_fadump_register(struct fw_dump *fadump_conf)
>
> static int rtas_fadump_unregister(struct fw_dump *fadump_conf)
> {
> - unsigned int wait_time;
> + unsigned int wait_time, total_wait;
> int rc;
>
> - /* TODO: Add upper time limit for the delay */
> + total_wait = 0;
> do {
> rc = rtas_call(fadump_conf->ibm_configure_kernel_dump, 3, 1,
> NULL, FADUMP_UNREGISTER, &fdm,
> sizeof(struct rtas_fadump_mem_struct));
>
> wait_time = rtas_busy_delay_time(rc);
> - if (wait_time)
> + if (wait_time) {
> + if (total_wait >= RTAS_FADUMP_MAX_WAIT_MS) {
> + pr_err("Timed out waiting for firmware to unregister fadump\n");
> + return -ETIMEDOUT;
> + }
> + total_wait += wait_time;
> mdelay(wait_time);
> + }
> } while (wait_time);
>
> if (rc) {
> @@ -259,18 +270,24 @@ static int rtas_fadump_unregister(struct fw_dump *fadump_conf)
>
> static int rtas_fadump_invalidate(struct fw_dump *fadump_conf)
> {
> - unsigned int wait_time;
> + unsigned int wait_time, total_wait;
> int rc;
>
> - /* TODO: Add upper time limit for the delay */
> + total_wait = 0;
> do {
> rc = rtas_call(fadump_conf->ibm_configure_kernel_dump, 3, 1,
> NULL, FADUMP_INVALIDATE, fdm_active,
> sizeof(struct rtas_fadump_mem_struct));
>
> wait_time = rtas_busy_delay_time(rc);
> - if (wait_time)
> + if (wait_time) {
> + if (total_wait >= RTAS_FADUMP_MAX_WAIT_MS) {
> + pr_err("Timed out waiting for firmware to invalidate fadump\n");
> + return -ETIMEDOUT;
> + }
> + total_wait += wait_time;
> mdelay(wait_time);
> + }
> } while (wait_time);
This do...while loop is almost identical in all three places.
Would it make sense to introduce a helper function to wrap the
rtas_call, along with handling the wait time and timeout?
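As a rough illustration of the helper idea, here is an untested
user-space sketch, not kernel code. The name fadump_call_bounded(),
the callback type, and the mock_* stand-ins for rtas_call(),
rtas_busy_delay_time(), and mdelay() are all hypothetical; only
RTAS_FADUMP_MAX_WAIT_MS and the shape of the loop come from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define RTAS_FADUMP_MAX_WAIT_MS	60000U	/* value taken from the patch */

/* Assumed status codes, modelled on the kernel's RTAS busy statuses. */
#define MOCK_RTAS_BUSY		(-2)
#define MOCK_EXT_DELAY_MIN	9900
#define MOCK_EXT_DELAY_MAX	9905

/* Hypothetical: one attempt at the firmware call, returns its status. */
typedef int (*fadump_fw_call_t)(void *ctx);

/* Simulated clock so the sketch never really sleeps. */
static unsigned int mock_clock_ms;

static void mock_mdelay(unsigned int ms)
{
	mock_clock_ms += ms;
}

/* Simplified stand-in for rtas_busy_delay_time(): 0 means "not busy". */
static unsigned int mock_busy_delay_time(int rc)
{
	unsigned int ms = 0;
	int order;

	if (rc == MOCK_RTAS_BUSY) {
		ms = 1;
	} else if (rc >= MOCK_EXT_DELAY_MIN && rc <= MOCK_EXT_DELAY_MAX) {
		order = rc - MOCK_EXT_DELAY_MIN;
		for (ms = 1; order > 0; order--)
			ms *= 10;
	}
	return ms;
}

/*
 * Retry the call while firmware reports busy. Returns 0 and stores the
 * final status in *rc_out, or returns -ETIMEDOUT (leaving *rc_out
 * untouched) once the accumulated delay reaches the limit.
 */
static int fadump_call_bounded(fadump_fw_call_t call, void *ctx, int *rc_out)
{
	unsigned int wait_time, total_wait = 0;
	int rc;

	assert(call != NULL && rc_out != NULL);

	do {
		rc = call(ctx);
		wait_time = mock_busy_delay_time(rc);
		if (wait_time) {
			if (total_wait >= RTAS_FADUMP_MAX_WAIT_MS)
				return -ETIMEDOUT;
			total_wait += wait_time;
			mock_mdelay(wait_time);
		}
	} while (wait_time);

	*rc_out = rc;
	return 0;
}

/* Mock firmware: answers busy a fixed number of times, then finishes. */
struct mock_fw {
	unsigned int busy_replies;
	int busy_status;
	int final_rc;
	unsigned int calls;
};

static int mock_fw_call(void *ctx)
{
	struct mock_fw *fw = ctx;

	fw->calls++;
	if (fw->busy_replies) {
		fw->busy_replies--;
		return fw->busy_status;
	}
	return fw->final_rc;
}
```

With something like this, each of the three call sites would only
supply its own operation and buffer and keep its existing handling
of rc, and the timeout policy would live in one place.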
- Sourabh Jain
> if (rc) {
> diff --git a/arch/powerpc/platforms/pseries/rtas-fadump.h b/arch/powerpc/platforms/pseries/rtas-fadump.h
> index c109abf6b..65fdab7b5 100644
> --- a/arch/powerpc/platforms/pseries/rtas-fadump.h
> +++ b/arch/powerpc/platforms/pseries/rtas-fadump.h
> @@ -41,6 +41,12 @@
> #define MAX_SECTIONS 10
> #define RTAS_FADUMP_MAX_BOOT_MEM_REGS 7
>
> +/*
> + * Maximum time to wait for firmware to respond to an
> + * ibm,configure-kernel-dump RTAS call before giving up.
> + */
> +#define RTAS_FADUMP_MAX_WAIT_MS 60000U
> +
> /* Kernel Dump section info */
> struct rtas_fadump_section {
> __be32 request_flag;