[PATCH 2/2] powerpc/papr_scm: Force a scm-unbind if initial scm-bind fails
Oliver O'Halloran
oohall at gmail.com
Tue Jun 25 11:23:29 AEST 2019
On Tue, Jun 25, 2019 at 12:59 AM Vaibhav Jain <vaibhav at linux.ibm.com> wrote:
>
> In some cases initial bind of scm memory for an lpar can fail if
> previously it wasn't released using a scm-unbind hcall. This situation
> can arise due to panic of the previous kernel or forced lpar reset. In
> such cases the H_SCM_BIND_MEM return a H_OVERLAP error.
What is a forced lpar reset? fadump?
> To mitigate such cases the patch updates drc_pmem_bind() to force a
> call to drc_pmem_unbind() in case the initial bind of scm memory fails
> with H_OVERLAP error. In case scm-bind operation again fails after the
> forced scm-unbind then we follow the existing error path.
>
> Signed-off-by: Vaibhav Jain <vaibhav at linux.ibm.com>
> ---
> arch/powerpc/platforms/pseries/papr_scm.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index d790e4e4ffb3..049d7927c0a4 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -44,19 +44,26 @@ struct papr_scm_priv {
> struct nd_interleave_set nd_set;
> };
> +/* Forward declaration */
pointless comment.
> +static int drc_pmem_unbind(struct papr_scm_priv *);
>
> static int drc_pmem_bind(struct papr_scm_priv *p)
> {
> unsigned long ret[PLPAR_HCALL_BUFSIZE];
> uint64_t rc, token;
> - uint64_t saved = 0;
> + uint64_t saved;
> + bool tried_unbind = false;
nit: kernel style uses reverse christmas tree declarations, so this should be:
unsigned long ret[PLPAR_HCALL_BUFSIZE];
bool tried_unbind = false;
uint64_t rc, token;
uint64_t saved;
Come to think of it rc should probably be signed since the hcall
return codes are negative. I'm surprised that's not causing a warning
since we use %lld to print rc rather than %llu.
> + dev_dbg(&p->pdev->dev, "bind drc %x\n", p->drc_index);
> /*
> * When the hypervisor cannot map all the requested memory in a single
> * hcall it returns H_BUSY and we call again with the token until
> * we get H_SUCCESS. Aborting the retry loop before getting H_SUCCESS
> * leave the system in an undefined state, so we wait.
> */
> +retry:
> token = 0;
> + saved = 0;
>
> do {
> rc = plpar_hcall(H_SCM_BIND_MEM, ret, p->drc_index, 0,
> @@ -68,8 +75,18 @@ static int drc_pmem_bind(struct papr_scm_priv *p)
> } while (rc == H_BUSY);
>
> if (rc) {
> - dev_err(&p->pdev->dev, "bind err: %lld\n", rc);
> - return -ENXIO;
> + /* retry after unbinding */
> + if (rc == H_OVERLAP && !tried_unbind) {
> + dev_warn(&p->pdev->dev, "Un-binding and retrying\n");
> + /* Try unbind and ignore any errors */
> + tried_unbind = true;
> + drc_pmem_unbind(p);
> + goto retry;
I think It'd be cleaner if we put the unbind-and-retry logic into the
probe function, e.g:
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c
b/arch/powerpc/platforms/pseries/papr_scm.c
index 96c53b23e58f..d113779fc27c 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -316,6 +316,14 @@ static int papr_scm_probe(struct platform_device *pdev)
/* request the hypervisor to bind this region to somewhere in memory */
rc = drc_pmem_bind(p);
+ if (rc == -EBUSY) {
+ /*
+ * If we kexec()ed the previous kernel might have left the DRC
+ * bound in memory. Unbind it and try again.
+ */
+ drc_pmem_unbind(p);
+ rc = drc_pmem_bind(p);
+ }
if (rc)
goto err;
Up to you though.
More information about the Linuxppc-dev
mailing list