[PATCH v2 3/5] libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Tue May 19 15:30:26 AEST 2020


Hi Dan,

Apologies for the delay in response. I was waiting for feedback from
hardware team before responding to this email.


Dan Williams <dan.j.williams at intel.com> writes:

> On Tue, May 12, 2020 at 8:47 PM Aneesh Kumar K.V
> <aneesh.kumar at linux.ibm.com> wrote:
>>
>> Architectures like ppc64 provide persistent memory specific barriers
>> that will ensure that all stores for which the modifications are
>> written to persistent storage by preceding dcbfps and dcbstps
>> instructions have updated persistent storage before any data
>> access or data transfer caused by subsequent instructions is initiated.
>> This is in addition to the ordering done by wmb()
>>
>> Update nvdimm core such that architecture can use barriers other than
>> wmb to ensure all previous writes are architecturally visible for
>> the platform buffer flush.
>
> This seems like an exceedingly bad idea, maybe I'm missing something.
> This implies that the deployed base of DAX applications using the old
> instruction sequence are going to regress on new hardware that
> requires the new instructions to be deployed.


pmdk support for ppc64 is still work in progress and there is pull
request to switch pmdk to use new instruction.

https://github.com/tuliom/pmdk/commit/fix-flush

All userspace applications will be switched to use the new
instructions. The new instructions are designed such that when running on P8
and P9 they behave as 'dcbf' and 'hwsync'.

Applications using new instructions will behave as expected when running
on P8 and P9. Only future hardware will differentiate between 'dcbf' and
'dcbfps' 

> I'm thinking the kernel
> should go as far as to disable DAX operation by default on new
> hardware until userspace asserts that it is prepared to switch to the
> new implementation. Is there any other way to ensure the forward
> compatibility of deployed ppc64 DAX applications?

AFAIU there is no released persistent memory hardware on ppc64 platform
and we need to make sure before applications get enabled to use these
persistent memory devices, they should switch to use the new
instruction?


>
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
>> ---
>>  drivers/nvdimm/region_devs.c | 8 ++++----
>>  include/linux/libnvdimm.h    | 4 ++++
>>  2 files changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
>> index ccbb5b43b8b2..88ea34a9c7fd 100644
>> --- a/drivers/nvdimm/region_devs.c
>> +++ b/drivers/nvdimm/region_devs.c
>> @@ -1216,13 +1216,13 @@ int generic_nvdimm_flush(struct nd_region *nd_region)
>>         idx = this_cpu_add_return(flush_idx, hash_32(current->pid + idx, 8));
>>
>>         /*
>> -        * The first wmb() is needed to 'sfence' all previous writes
>> -        * such that they are architecturally visible for the platform
>> -        * buffer flush.  Note that we've already arranged for pmem
>> +        * The first arch_pmem_flush_barrier() is needed to 'sfence' all
>> +        * previous writes such that they are architecturally visible for
>> +        * the platform buffer flush. Note that we've already arranged for pmem
>>          * writes to avoid the cache via memcpy_flushcache().  The final
>>          * wmb() ensures ordering for the NVDIMM flush write.
>>          */
>> -       wmb();
>> +       arch_pmem_flush_barrier();
>>         for (i = 0; i < nd_region->ndr_mappings; i++)
>>                 if (ndrd_get_flush_wpq(ndrd, i, 0))
>>                         writeq(1, ndrd_get_flush_wpq(ndrd, i, idx));
>> diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
>> index 18da4059be09..66f6c65bd789 100644
>> --- a/include/linux/libnvdimm.h
>> +++ b/include/linux/libnvdimm.h
>> @@ -286,4 +286,8 @@ static inline void arch_invalidate_pmem(void *addr, size_t size)
>>  }
>>  #endif
>>
>> +#ifndef arch_pmem_flush_barrier
>> +#define arch_pmem_flush_barrier() wmb()
>> +#endif
>> +
>>  #endif /* __LIBNVDIMM_H__ */
>> --
>> 2.26.2
>>
> _______________________________________________
> Linux-nvdimm mailing list -- linux-nvdimm at lists.01.org
> To unsubscribe send an email to linux-nvdimm-leave at lists.01.org


More information about the Linuxppc-dev mailing list