[PATCH v5 00/10] Support new pmem flush and sync instructions for POWER

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Wed Jun 10 16:23:33 AEST 2020


This patch series enables the usage os new pmem flush and sync instructions on POWER
architecture. POWER10 introduces two new variants of dcbf instructions (dcbstps and dcbfps)
that can be used to write modified locations back to persistent storage. Additionally,
POWER10 also introduce phwsync and plwsync which can be used to establish order of these
writes to persistent storage.
    
This series exposes these instructions to the rest of the kernel. The existing
dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
synchronization with OpenCAPI-hosted persistent storage. Hence the new instructions
are added as a variant of the old ones that old hardware won't differentiate.

On POWER10, pmem devices will be represented by a different device tree compat
strings. This ensures that older kernels won't initialize pmem devices on POWER10.

W.r.t userspace we want to make sure applications are enabled to use MAP_SYNC only
if they are using the new instructions. To avoid the wrong usage of MAP_SYNC on
newer hardware, we disable MAP_SYNC by default on newer hardware. The namespace specific
attribute /sys/block/pmem0/dax/sync_fault can be used to enable MAP_SYNC later.

With this:
1) vPMEM continues to work since it is a volatile region. That 
doesn't need any flush instructions.

2) pmdk and other user applications get updated to use new instructions
and updated packages are made available to all distributions

3) On newer hardware, the device will appear with a new compat string. 
Hence older distributions won't initialize pmem on newer hardware.

4) If we have a newer kernel with an older distro, we use the per 
namespace sysfs knob that prevents the usage of MAP_SYNC.

5) Sometime in the future, we mark the CONFIG_ARCH_MAP_SYNC_DISABLE=n
on ppc64 when we are confident that everybody is using the new flush 
instruction.

Chaanges from V4:
* Add namespace specific sychronous fault control.

Changes from V3:
* Add new compat string to be used for the device.
* Use arch_pmem_flush_barrier() in dm-writecache.

Aneesh Kumar K.V (10):
  powerpc/pmem: Restrict papr_scm to P8 and above.
  powerpc/pmem: Add new instructions for persistent storage and sync
  powerpc/pmem: Add flush routines using new pmem store and sync
    instruction
  libnvdimm/nvdimm/flush: Allow architecture to override the flush
    barrier
  powerpc/pmem/of_pmem: Update of_pmem to use the new barrier
    instruction.
  powerpc/pmem: Avoid the barrier in flush routines
  powerpc/book3s/pmem: Add WARN_ONCE to catch the wrong usage of pmem
    flush functions.
  libnvdimm/dax: Add a dax flag to control synchronous fault support
  powerpc/pmem: Disable synchronous fault by default
  powerpc/pmem: Initialize pmem device on newer hardware

 arch/powerpc/include/asm/cacheflush.h     | 10 ++++
 arch/powerpc/include/asm/ppc-opcode.h     | 12 ++++
 arch/powerpc/lib/pmem.c                   | 46 ++++++++++++--
 arch/powerpc/platforms/Kconfig.cputype    |  9 +++
 arch/powerpc/platforms/pseries/papr_scm.c | 31 +++++++++-
 arch/powerpc/platforms/pseries/pmem.c     |  6 ++
 drivers/dax/bus.c                         |  2 +-
 drivers/dax/super.c                       | 73 +++++++++++++++++++++++
 drivers/md/dm-writecache.c                |  2 +-
 drivers/nvdimm/of_pmem.c                  |  8 +++
 drivers/nvdimm/pmem.c                     |  4 ++
 drivers/nvdimm/region_devs.c              | 24 ++++++--
 include/linux/dax.h                       | 16 +++++
 include/linux/libnvdimm.h                 |  8 +++
 mm/Kconfig                                |  3 +
 15 files changed, 243 insertions(+), 11 deletions(-)

-- 
2.26.2



More information about the Linuxppc-dev mailing list