[PATCHv3 2/2] pseries/scm: buffer pmem's bound addr in dt for kexec kernel

Pingfan Liu kernelfans at gmail.com
Mon Mar 16 13:49:37 AEDT 2020


Appreciate for your kind review. And I have some comment as below.

On Fri, Mar 13, 2020 at 11:18 AM Oliver O'Halloran <oohall at gmail.com> wrote:
>
> On Wed, Mar 4, 2020 at 7:50 PM Pingfan Liu <kernelfans at gmail.com> wrote:
> >
> > At present, plpar_hcall(H_SCM_BIND_MEM, ...) takes a very long time, so
> > if dumping to fsdax, it will take a very long time.
> >
> > Take a closer look, during the papr_scm initialization, the only
> > configuration is through drc_pmem_bind()-> plpar_hcall(H_SCM_BIND_MEM,
> > ...), which helps to set up the bound address.
> >
> > On pseries, for kexec -l/-p kernel, there is no reset of hardware, and this
> > step can be stepped around to save times.  So the pmem bound address can be
> > passed to the 2nd kernel through a dynamic added property "bound-addr" in
> > dt node 'ibm,pmemory'.
> >
> > Signed-off-by: Pingfan Liu <kernelfans at gmail.com>
> > To: linuxppc-dev at lists.ozlabs.org
> > Cc: Benjamin Herrenschmidt <benh at kernel.crashing.org>
> > Cc: Paul Mackerras <paulus at samba.org>
> > Cc: Michael Ellerman <mpe at ellerman.id.au>
> > Cc: Hari Bathini <hbathini at linux.ibm.com>
> > Cc: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
> > Cc: Oliver O'Halloran <oohall at gmail.com>
> > Cc: Dan Williams <dan.j.williams at intel.com>
> > Cc: Andrew Donnellan <ajd at linux.ibm.com>
> > Cc: Christophe Leroy <christophe.leroy at c-s.fr>
> > Cc: Rob Herring <robh+dt at kernel.org>
> > Cc: Frank Rowand <frowand.list at gmail.com>
> > Cc: kexec at lists.infradead.org
> > ---
> > note: This patch has not been tested since I can not get such a pseries with pmem.
> >       Please kindly to give some suggestion, thanks.
>
> There was some qemu patches to implement the Hcall interface floating
> around a while ago. I'm not sure they ever made it into upstream qemu
> though.
Unfortunately, it does not appear in latest qemu code. I think
probably virt-pmem has achieved the same feature.
>
> > ---
> >  arch/powerpc/platforms/pseries/of_helpers.c |  1 +
> >  arch/powerpc/platforms/pseries/papr_scm.c   | 33 ++++++++++++++++++++---------
> >  drivers/of/base.c                           |  1 +
> >  3 files changed, 25 insertions(+), 10 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/pseries/of_helpers.c b/arch/powerpc/platforms/pseries/of_helpers.c
> > index 1022e0f..2c7bab4 100644
> > --- a/arch/powerpc/platforms/pseries/of_helpers.c
> > +++ b/arch/powerpc/platforms/pseries/of_helpers.c
> > @@ -34,6 +34,7 @@ struct property *new_property(const char *name, const int length,
> >         kfree(new);
> >         return NULL;
> >  }
> > +EXPORT_SYMBOL(new_property);
> >
> >  /**
> >   * pseries_of_derive_parent - basically like dirname(1)
> > diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
> > index 0b4467e..54ae903 100644
> > --- a/arch/powerpc/platforms/pseries/papr_scm.c
> > +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> > @@ -14,6 +14,7 @@
> >  #include <linux/delay.h>
> >
> >  #include <asm/plpar_wrappers.h>
> > +#include "of_helpers.h"
> >
> >  #define BIND_ANY_ADDR (~0ul)
> >
> > @@ -383,7 +384,7 @@ static int papr_scm_probe(struct platform_device *pdev)
> >  {
> >         struct device_node *dn = pdev->dev.of_node;
> >         u32 drc_index, metadata_size;
> > -       u64 blocks, block_size;
> > +       u64 blocks, block_size, bound_addr = 0;
> >         struct papr_scm_priv *p;
> >         const char *uuid_str;
> >         u64 uuid[2];
> > @@ -440,17 +441,29 @@ static int papr_scm_probe(struct platform_device *pdev)
> >         p->metadata_size = metadata_size;
> >         p->pdev = pdev;
> >
> > -       /* request the hypervisor to bind this region to somewhere in memory */
> > -       rc = drc_pmem_bind(p);
> > +       of_property_read_u64(dn, "bound-addr", &bound_addr);
> > +       if (bound_addr) {
> > +               p->bound_addr = bound_addr;
> > +       } else {
> > +               struct property *property;
> > +               u64 big;
> >
> > -       /* If phyp says drc memory still bound then force unbound and retry */
> > -       if (rc == H_OVERLAP)
> > -               rc = drc_pmem_query_n_bind(p);
> > +               /* request the hypervisor to bind this region to somewhere in memory */
> > +               rc = drc_pmem_bind(p);
> >
> > -       if (rc != H_SUCCESS) {
> > -               dev_err(&p->pdev->dev, "bind err: %d\n", rc);
> > -               rc = -ENXIO;
> > -               goto err;
> > +               /* If phyp says drc memory still bound then force unbound and retry */
> > +               if (rc == H_OVERLAP)
> > +                       rc = drc_pmem_query_n_bind(p);
> > +
> > +               if (rc != H_SUCCESS) {
> > +                       dev_err(&p->pdev->dev, "bind err: %d\n", rc);
> > +                       rc = -ENXIO;
> > +                       goto err;
> > +               }
> > +               big = cpu_to_be64(p->bound_addr);
> > +               property = new_property("bound-addr", sizeof(u64), (const unsigned char *)&big,
> > +                       NULL);
>
> That should probably be "linux,bound-addr"
OK, thanks for suggestion.
>
> The other thing that stands out to me is that you aren't removing the
Yes, you are right. I will fix it in V2.
> property when the region is unbound. As a general rule I'd prefer we
> didn't hack the DT at runtime, but if we are going to then we should
> make sure we're not putting anything wrong in there.
Actually, the dynamically building of DT is widely used by "kexec -l".

The pre-condition for the hacked method is that the bound pmem-addr
will not change. And on pseries, during kexec -l/-p, a machine reset
will not be invoked, so the bound address should be changed.

Thanks,
Pingfan
[...]


More information about the Linuxppc-dev mailing list