[PATCH v2 3/5] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Fri May 22 03:02:51 AEST 2020


On 5/21/20 8:08 PM, Jeff Moyer wrote:
> Dan Williams <dan.j.williams at intel.com> writes:
> 
>>> But I agree with your concern that if we have older kernel/applications
>>> that continue to use `dcbf` on future hardware we will end up
>>> having issues w.r.t powerfail consistency. The plan is what you outlined
>>> above as tighter ecosystem control. Considering we don't have a pmem
>>> device generally available, we get both kernel and userspace upgraded
>>> to use these new instructions before such a device is made available.
> 
> I thought power already supported NVDIMM-N, no?  So are you saying that
> those devices will continue to work with the existing flushing and
> fencing mechanisms?
> 

yes. these devices can continue to use 'dcbf + hwsync' as long as we are 
running them on P9.


>> Ok, I think a compile time kernel option with a runtime override
>> satisfies my concern. Does that work for you?
> 
> The compile time option only helps when running newer kernels.  I'm not
> sure how you would even begin to audit userspace applications (keep in
> mind, not every application is open source, and not every application
> uses pmdk).  I also question the merits of forcing the administrator to
> make the determination of whether all applications on the system will
> work properly.  Really, you have to rely on the vendor to tell you the
> platform is supported, and at that point, why put further hurdles in the
> way?
> 
> The decision to require different instructions on ppc is unfortunate,
> but one I'm sure we have no control over.  I don't see any merit in the
> kernel disallowing MAP_SYNC access on these platforms.  Ideally, we'd
> have some way of ensuring older kernels don't work with these new
> platforms, but I don't think that's possible.
> 


I am currently looking at the possibility of firmware present these 
devices with different device-tree compat values. So that older 
/existing kernel won't initialize the device on newer systems. Is that a 
good compromise? We still can end up with older userspace and newer 
kernel. One of the option suggested by Jan Kara is to use a prctl flag 
to control that? (intead of kernel parameter option I posted before)


> Moving on to the patch itself--Aneesh, have you audited other persistent
> memory users in the kernel?  For example, drivers/md/dm-writecache.c does
> this:
> 
> static void writecache_commit_flushed(struct dm_writecache *wc, bool wait_for_ios)
> {
>   	if (WC_MODE_PMEM(wc))
> 	        wmb(); <==========
>          else
>                  ssd_commit_flushed(wc, wait_for_ios);
> }
> 
> I believe you'll need to make modifications there.
> 

Correct. Thanks for catching that.


I don't understand dm much, wondering how this will work with 
non-synchronous DAX device?

-aneesh


More information about the Linuxppc-dev mailing list