[PATCH v2 3/5] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier

Mikulas Patocka mpatocka at redhat.com
Fri May 22 23:01:17 AEST 2020



On Fri, 22 May 2020, Aneesh Kumar K.V wrote:

> On 5/22/20 3:01 PM, Michal Suchánek wrote:
> > On Thu, May 21, 2020 at 02:52:30PM -0400, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Thu, 21 May 2020, Dan Williams wrote:
> > > 
> > > > On Thu, May 21, 2020 at 10:03 AM Aneesh Kumar K.V
> > > > <aneesh.kumar at linux.ibm.com> wrote:
> > > > > 
> > > > > > Moving on to the patch itself--Aneesh, have you audited other
> > > > > > persistent
> > > > > > memory users in the kernel?  For example, drivers/md/dm-writecache.c
> > > > > > does
> > > > > > this:
> > > > > > 
> > > > > > static void writecache_commit_flushed(struct dm_writecache *wc, bool
> > > > > > wait_for_ios)
> > > > > > {
> > > > > >        if (WC_MODE_PMEM(wc))
> > > > > >                wmb(); <==========
> > > > > >           else
> > > > > >                   ssd_commit_flushed(wc, wait_for_ios);
> > > > > > }
> > > > > > 
> > > > > > I believe you'll need to make modifications there.
> > > > > > 
> > > > > 
> > > > > Correct. Thanks for catching that.
> > > > > 
> > > > > 
> > > > > I don't understand dm much, wondering how this will work with
> > > > > non-synchronous DAX device?
> > > > 
> > > > That's a good point. DM-writecache needs to be cognizant of things
> > > > like virtio-pmem that violate the rule that persisent memory writes
> > > > can be flushed by CPU functions rather than calling back into the
> > > > driver. It seems we need to always make the flush case a dax_operation
> > > > callback to account for this.
> > > 
> > > dm-writecache is normally sitting on the top of dm-linear, so it would
> > > need to pass the wmb() call through the dm core and dm-linear target ...
> > > that would slow it down ... I remember that you already did it this way
> > > some times ago and then removed it.
> > > 
> > > What's the exact problem with POWER? Could the POWER system have two types
> > > of persistent memory that need two different ways of flushing?
> > 
> > As far as I understand the discussion so far
> > 
> >   - on POWER $oldhardware uses $oldinstruction to ensure pmem consistency
> >   - on POWER $newhardware uses $newinstruction to ensure pmem consistency
> >     (compatible with $oldinstruction on $oldhardware)
> 
> Correct.
> 
> >   - on some platforms instead of barrier instruction a callback into the
> >     driver is issued to ensure consistency 
> 
> This is virtio-pmem only at this point IIUC.
> 
> -aneesh

And does the virtio-pmem driver track which pages are dirty? Or does it 
need to specify the range of pages to flush in the flush function?

> > None of this is reflected by the dm driver.

We could make a new dax method:
void *(dax_get_flush_function)(void);

This would return a pointer to "wmb()" on x86 and something else on Power.

The method "dax_get_flush_function" would be called only once when 
initializing the writecache driver (because the call would be slow because 
it would have to go through the DM stack) and then, the returned function 
would be called each time we need write ordering. The returned function 
would do just "sfence; ret".

Mikulas


More information about the Linuxppc-dev mailing list