[PATCH v6 6/8] powerpc/pmem: Avoid the barrier in flush routines

Dan Williams dan.j.williams at intel.com
Wed Jul 1 15:08:48 AEST 2020


On Tue, Jun 30, 2020 at 8:09 PM Aneesh Kumar K.V
<aneesh.kumar at linux.ibm.com> wrote:
>
> On 7/1/20 1:15 AM, Dan Williams wrote:
> > On Tue, Jun 30, 2020 at 2:21 AM Aneesh Kumar K.V
> > <aneesh.kumar at linux.ibm.com> wrote:
> > [..]
> >>>> The bio argument isn't for range based flushing, it is for flush
> >>>> operations that need to complete asynchronously.
> >>> How does the block layer determine that the pmem device needs
> >>> asynchronous fushing?
> >>>
> >>
> >>          set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
> >>
> >> and dax_synchronous(dev)
> >
> > Yes, but I think it is overkill to have an indirect function call just
> > for a single instruction.
> >
> > How about something like this instead, to share a common pmem_wmb()
> > across x86 and powerpc.
> >
> > diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
> > index 20ff30c2ab93..b14009060c83 100644
> > --- a/drivers/nvdimm/region_devs.c
> > +++ b/drivers/nvdimm/region_devs.c
> > @@ -1180,6 +1180,13 @@ int nvdimm_flush(struct nd_region *nd_region,
> > struct bio *bio)
> >   {
> >          int rc = 0;
> >
> > +       /*
> > +        * pmem_wmb() is needed to 'sfence' all previous writes such
> > +        * that they are architecturally visible for the platform buffer
> > +        * flush.
> > +        */
> > +       pmem_wmb();
> > +
> >          if (!nd_region->flush)
> >                  rc = generic_nvdimm_flush(nd_region);
> >          else {
> > @@ -1206,17 +1213,14 @@ int generic_nvdimm_flush(struct nd_region *nd_region)
> >          idx = this_cpu_add_return(flush_idx, hash_32(current->pid + idx, 8));
> >
> >          /*
> > -        * The first wmb() is needed to 'sfence' all previous writes
> > -        * such that they are architecturally visible for the platform
> > -        * buffer flush.  Note that we've already arranged for pmem
> > -        * writes to avoid the cache via memcpy_flushcache().  The final
> > -        * wmb() ensures ordering for the NVDIMM flush write.
> > +        * Note that we've already arranged for pmem writes to avoid the
> > +        * cache via memcpy_flushcache().  The final wmb() ensures
> > +        * ordering for the NVDIMM flush write.
> >           */
> > -       wmb();
>
>
> The series already convert this to pmem_wmb().
>
> >          for (i = 0; i < nd_region->ndr_mappings; i++)
> >                  if (ndrd_get_flush_wpq(ndrd, i, 0))
> >                          writeq(1, ndrd_get_flush_wpq(ndrd, i, idx));
> > -       wmb();
> > +       pmem_wmb();
>
>
> Should this be pmem_wmb()? This is ordering the above writeq() right?

Correct, this can just be wmb().

>
> >
> >          return 0;
> >   }
> >
>
> This still results in two pmem_wmb() on platforms that doesn't have
> flush_wpq. I was trying to avoid that by adding a nd_region->flush call
> back.

How about skip or exit early out of generic_nvdimm_flush if
ndrd->flush_wpq is NULL? That still saves an indirect branch at the
cost of another conditional, but that should still be worth it.


More information about the Linuxppc-dev mailing list