[PATCH v5 00/10] Support new pmem flush and sync instructions for POWER
Aneesh Kumar K.V
aneesh.kumar at linux.ibm.com
Fri Jun 19 23:10:07 AEST 2020
"Aneesh Kumar K.V" <aneesh.kumar at linux.ibm.com> writes:
> This patch series enables the use of the new pmem flush and sync instructions on the POWER
> architecture. POWER10 introduces two new variants of the dcbf instruction (dcbstps and dcbfps)
> that can be used to write modified locations back to persistent storage. Additionally,
> POWER10 introduces phwsync and plwsync, which can be used to establish the order of these
> writes to persistent storage.
>
> This series exposes these instructions to the rest of the kernel. The existing
> dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
> synchronization with OpenCAPI-hosted persistent storage. Hence the new instructions
> are added as a variant of the old ones that old hardware won't differentiate.
>
> On POWER10, pmem devices will be represented by a different device tree compat
> string. This ensures that older kernels won't initialize pmem devices on POWER10.
>
> W.r.t. userspace, we want to make sure applications are enabled to use MAP_SYNC only
> if they are using the new instructions. To avoid incorrect use of MAP_SYNC on
> newer hardware, we disable MAP_SYNC by default there. The namespace-specific
> attribute /sys/block/pmem0/dax/sync_fault can be used to enable MAP_SYNC later.
>
> With this:
> 1) vPMEM continues to work since it is a volatile region. That
> doesn't need any flush instructions.
>
> 2) pmdk and other user applications get updated to use new instructions
> and updated packages are made available to all distributions
>
> 3) On newer hardware, the device will appear with a new compat string.
> Hence older distributions won't initialize pmem on newer hardware.
>
> 4) If we have a newer kernel with an older distro, we use the per
> namespace sysfs knob that prevents the usage of MAP_SYNC.
>
> 5) Sometime in the future, we set CONFIG_ARCH_MAP_SYNC_DISABLE=n
> on ppc64 when we are confident that everybody is using the new flush
> instructions.
>
> Changes from V4:
> * Add namespace-specific synchronous fault control.
>
> Changes from V3:
> * Add new compat string to be used for the device.
> * Use arch_pmem_flush_barrier() in dm-writecache.
>
> Aneesh Kumar K.V (10):
> powerpc/pmem: Restrict papr_scm to P8 and above.
> powerpc/pmem: Add new instructions for persistent storage and sync
> powerpc/pmem: Add flush routines using new pmem store and sync
> instruction
> libnvdimm/nvdimm/flush: Allow architecture to override the flush
> barrier
> powerpc/pmem/of_pmem: Update of_pmem to use the new barrier
> instruction.
> powerpc/pmem: Avoid the barrier in flush routines
> powerpc/book3s/pmem: Add WARN_ONCE to catch the wrong usage of pmem
> flush functions.
> libnvdimm/dax: Add a dax flag to control synchronous fault support
> powerpc/pmem: Disable synchronous fault by default
> powerpc/pmem: Initialize pmem device on newer hardware
>
> arch/powerpc/include/asm/cacheflush.h | 10 ++++
> arch/powerpc/include/asm/ppc-opcode.h | 12 ++++
> arch/powerpc/lib/pmem.c | 46 ++++++++++++--
> arch/powerpc/platforms/Kconfig.cputype | 9 +++
> arch/powerpc/platforms/pseries/papr_scm.c | 31 +++++++++-
> arch/powerpc/platforms/pseries/pmem.c | 6 ++
> drivers/dax/bus.c | 2 +-
> drivers/dax/super.c | 73 +++++++++++++++++++++++
> drivers/md/dm-writecache.c | 2 +-
> drivers/nvdimm/of_pmem.c | 8 +++
> drivers/nvdimm/pmem.c | 4 ++
> drivers/nvdimm/region_devs.c | 24 ++++++--
> include/linux/dax.h | 16 +++++
> include/linux/libnvdimm.h | 8 +++
> mm/Kconfig | 3 +
> 15 files changed, 243 insertions(+), 11 deletions(-)
Ping.
Are we good with the approach here?
-aneesh
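
For readers following the MAP_SYNC discussion in the cover letter, here is a minimal userspace sketch of how an application might request MAP_SYNC and fall back to a plain shared mapping when the kernel refuses it (for example when synchronous faults are disabled for the namespace, or the file is not on DAX). The helper name map_prefer_sync() is hypothetical and not part of this series; the fallback flag values are assumed to match the generic Linux uapi headers and are only used when libc does not define them.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed fallbacks for older libc headers (generic Linux uapi values). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x80000
#endif

/*
 * Hypothetical helper: map `len` bytes of `path` read/write, preferring
 * MAP_SYNC. On return *is_sync tells whether the kernel honoured MAP_SYNC.
 * Returns NULL if the file cannot be opened or mapped at all.
 */
static void *map_prefer_sync(const char *path, size_t len, bool *is_sync)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    /* MAP_SYNC is only valid together with MAP_SHARED_VALIDATE. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
    *is_sync = (p != MAP_FAILED);

    if (p == MAP_FAILED) {
        /*
         * Typically EOPNOTSUPP or EINVAL here. Without MAP_SYNC the
         * caller must use msync()/fsync() to make stores durable.
         */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    }

    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

An application would check is_sync after the call: only when it is true can it rely on CPU flush and sync instructions alone for persistence; otherwise it has to go through msync() or fsync().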