[PATCH 2/2] powerpc/nvdimm: use H_SCM_QUERY hcall on H_OVERLAP error

Oliver O'Halloran oohall at gmail.com
Thu Aug 29 17:59:56 AEST 2019


On Thu, Aug 29, 2019 at 4:34 PM Aneesh Kumar K.V
<aneesh.kumar at linux.ibm.com> wrote:
>
> Right now we force an unbind of SCM memory at drcindex on H_OVERLAP error.
> This really slows down operations like kexec where we get the H_OVERLAP
> error because we don't go through a full hypervisor re-init.

Maybe we should be unbinding it on a kexec().

> H_OVERLAP error for a H_SCM_BIND_MEM hcall indicates that SCM memory at
> drc index is already bound. Since we don't specify a logical memory
> address for bind hcall, we can use the H_SCM_QUERY hcall to query
> the already bound logical address.

This is a little sketchy since we might have crashed during the
initial bind. Checking if the last block is bound to where we expect
it to be might be a good idea. If it's not where we expect it to be,
then an unbind->bind cycle is the only sane thing to do.
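
Something along these lines is what I have in mind. This is only a
user-space sketch of the check, not driver code: query_block_fn stands in
for the H_SCM_QUERY_BLOCK_MEM_BINDING hcall, and the two mock_* functions,
the block counts and the addresses are all made up for illustration. It
also assumes a complete bind maps the blocks contiguously.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the query hcall: returns 0 and stores the
 * logical address of 'block' in *addr, or nonzero if it is not bound.
 */
typedef int (*query_block_fn)(uint32_t drc_index, uint64_t block,
			      uint64_t *addr);

/*
 * Returns true only if both the first and the last block are bound and
 * the last block sits exactly where a complete, contiguous bind would
 * have put it. On success *bound_addr holds the start of the region.
 */
static bool binding_is_complete(query_block_fn query, uint32_t drc_index,
				uint64_t blocks, uint64_t block_size,
				uint64_t *bound_addr)
{
	uint64_t first, last;

	assert(blocks > 0 && block_size > 0);

	if (query(drc_index, 0, &first) != 0)
		return false;
	if (query(drc_index, blocks - 1, &last) != 0)
		return false;
	if (last - first != (blocks - 1) * block_size)
		return false;

	*bound_addr = first;
	return true;
}

/* Mock: every block is bound contiguously starting at 0x100000. */
static int mock_full(uint32_t drc_index, uint64_t block, uint64_t *addr)
{
	(void)drc_index;
	*addr = 0x100000 + block * 0x1000;
	return 0;
}

/* Mock: only the first two blocks were bound before a crash. */
static int mock_partial(uint32_t drc_index, uint64_t block, uint64_t *addr)
{
	(void)drc_index;
	if (block >= 2)
		return -1;
	*addr = 0x100000 + block * 0x1000;
	return 0;
}
```

If the check fails, the caller would fall back to the existing
unbind->bind path rather than trusting the partial binding.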

> Boot time difference with and without patch is:
>
> [    5.583617] IOMMU table initialized, virtual merging enabled
> [    5.603041] papr_scm ibm,persistent-memory:ibm,pmemory at 44104001: Retrying bind after unbinding
> [  301.514221] papr_scm ibm,persistent-memory:ibm,pmemory at 44108001: Retrying bind after unbinding
> [  340.057238] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs

Is the unbind significantly slower than a bind? Or is the region here
just massive?

> after fix
>
> [    5.101572] IOMMU table initialized, virtual merging enabled
> [    5.116984] papr_scm ibm,persistent-memory:ibm,pmemory at 44104001: Querying SCM details
> [    5.117223] papr_scm ibm,persistent-memory:ibm,pmemory at 44108001: Querying SCM details
> [    5.120530] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/papr_scm.c | 26 ++++++++++++++++++++---
>  1 file changed, 23 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> index 220e595cb579..4b74cfe7b334 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -110,6 +110,27 @@ static void drc_pmem_unbind(struct papr_scm_priv *p)
>         return;
>  }
>
> +static int drc_pmem_query(struct papr_scm_priv *p)
> +{
> +       unsigned long ret[PLPAR_HCALL_BUFSIZE];
> +       int64_t rc;
> +
> +
> +       rc = plpar_hcall(H_SCM_QUERY_BLOCK_MEM_BINDING, ret,
> +                        p->drc_index, 0);
> +
> +       if (rc) {
> +               dev_err(&p->pdev->dev, "Failed to query SCM binding\n");
> +               return rc;
> +       }
> +
> +       p->bound_addr = ret[0];
> +       dev_dbg(&p->pdev->dev, "bound drc 0x%x to %pR\n", p->drc_index, &p->res);
> +
> +       return 0;
> +}
> +
> +
>  static int papr_scm_meta_get(struct papr_scm_priv *p,
>                              struct nd_cmd_get_config_data_hdr *hdr)
>  {
> @@ -431,9 +452,8 @@ static int papr_scm_probe(struct platform_device *pdev)
>
>         /* If phyp says drc memory still bound then force unbound and retry */
>         if (rc == H_OVERLAP) {
> -               dev_warn(&pdev->dev, "Retrying bind after unbinding\n");
> -               drc_pmem_unbind(p);
> -               rc = drc_pmem_bind(p);

> +               dev_warn(&pdev->dev, "Querying SCM details\n");

That's a pretty vague message. If we're going to treat leaving the
region bound over kexec() as normal, then you might want to bump it
down to dev_info() or so.

> +               rc = drc_pmem_query(p);
>         }
>
>         if (rc != H_SUCCESS) {
> --
> 2.21.0
>

