[PATCH v5 00/10] Support new pmem flush and sync instructions for POWER
Aneesh Kumar K.V
aneesh.kumar at linux.ibm.com
Wed Jun 10 16:23:33 AEST 2020
This patch series enables the use of the new pmem flush and sync instructions on the POWER
architecture. POWER10 introduces two new variants of the dcbf instruction (dcbstps and dcbfps)
that can be used to write modified locations back to persistent storage. Additionally,
POWER10 introduces phwsync and plwsync, which can be used to establish the order of these
writes to persistent storage.
This series exposes these instructions to the rest of the kernel. The existing
dcbf and hwsync instructions in P8 and P9 are adequate to enable appropriate
synchronization with OpenCAPI-hosted persistent storage. Hence the new instructions
are added as a variant of the old ones that old hardware won't differentiate.
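To illustrate what "a variant of the old ones" means at the encoding level, here is a
minimal sketch in plain C. It assumes, based on my reading of the Power ISA v3.1, that the
new instructions reuse the dcbf and sync opcodes and differ only in the L field (dcbfps is
dcbf with L=4, dcbstps is dcbf with L=6, phwsync is sync with L=4, plwsync is sync with
L=5). The helper names below are hypothetical and are not the macros added by this series;
please check the values against patch 02/10 and the ISA before relying on them.

```c
#include <stdint.h>

/* Base opcodes of the existing instructions, with the L field set to 0. */
#define INST_DCBF 0x7c0000acu
#define INST_SYNC 0x7c0004acu

/* Field placement within the 32-bit instruction word (assumed, see above). */
static inline uint32_t ppc_ra(unsigned int r) { return (uint32_t)(r & 0x1f) << 16; }
static inline uint32_t ppc_rb(unsigned int r) { return (uint32_t)(r & 0x1f) << 11; }
static inline uint32_t ppc_l(unsigned int l)  { return (uint32_t)(l & 0x7) << 21; }

/* dcbfps RA,RB: flush a cache block to persistent storage (assumed L=4). */
static inline uint32_t encode_dcbfps(unsigned int ra, unsigned int rb)
{
	return INST_DCBF | ppc_l(4) | ppc_ra(ra) | ppc_rb(rb);
}

/* dcbstps RA,RB: write back a cache block to persistent storage (assumed L=6). */
static inline uint32_t encode_dcbstps(unsigned int ra, unsigned int rb)
{
	return INST_DCBF | ppc_l(6) | ppc_ra(ra) | ppc_rb(rb);
}

/* phwsync: heavyweight sync that also orders persistent writes (assumed L=4). */
static inline uint32_t encode_phwsync(void)
{
	return INST_SYNC | ppc_l(4);
}

/* plwsync: lightweight counterpart of phwsync (assumed L=5). */
static inline uint32_t encode_plwsync(void)
{
	return INST_SYNC | ppc_l(5);
}
```

Because only the L field changes, the same instruction word is still a dcbf or sync form,
which is why older hardware is described above as not differentiating the new variants.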
On POWER10, pmem devices will be represented by a different device tree compat
string. This ensures that older kernels won't initialize pmem devices on POWER10.
For userspace, we want to make sure applications can use MAP_SYNC only if they
use the new instructions. To prevent incorrect use of MAP_SYNC on newer
hardware, we disable MAP_SYNC by default there. The namespace-specific
attribute /sys/block/pmem0/dax/sync_fault can be used to enable MAP_SYNC later.
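For readers less familiar with the userspace side, the following is a rough sketch of how
an application might request a MAP_SYNC mapping and fall back when it is refused. The
helper name map_pmem_file() is hypothetical, and the fallback #defines are only there for
older libc headers (the values are the asm-generic ones, an assumption for other layouts).
With MAP_SYNC disabled by default, I would expect the first mmap() to fail on such a
namespace, but the exact errno this series returns is not stated here, so the sketch
accepts the two values mmap(2) documents for unsupported flags.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers; asm-generic values (assumption). */
#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE 0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC 0x080000
#endif

/*
 * Hypothetical helper: try a synchronous-fault mapping first, then fall
 * back to a plain shared mapping. *is_sync tells the caller which one it
 * got, so it knows whether flushing from userspace is sufficient or
 * whether it still has to call fsync()/msync().
 */
static void *map_pmem_file(int fd, size_t len, int *is_sync)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	if (p != MAP_FAILED) {
		*is_sync = 1;
		return p;
	}

	/* Any error other than "flag not supported" is a real failure. */
	if (errno != EOPNOTSUPP && errno != EINVAL)
		return MAP_FAILED;

	*is_sync = 0;
	return mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}
```

An application that gets is_sync == 1 on newer hardware is then expected to flush with the
new instructions; one that gets 0 keeps relying on fsync() or msync().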
With this:
1) vPMEM continues to work since it is a volatile region that
doesn't need any flush instructions.
2) pmdk and other user applications get updated to use new instructions
and updated packages are made available to all distributions
3) On newer hardware, the device will appear with a new compat string.
Hence older distributions won't initialize pmem on newer hardware.
4) If we have a newer kernel with an older distro, we use the per
namespace sysfs knob that prevents the usage of MAP_SYNC.
5) Sometime in the future, we set CONFIG_ARCH_MAP_SYNC_DISABLE=n
on ppc64 once we are confident that everybody is using the new flush
instructions.
Changes from V4:
* Add namespace-specific synchronous fault control.
Changes from V3:
* Add new compat string to be used for the device.
* Use arch_pmem_flush_barrier() in dm-writecache.
Aneesh Kumar K.V (10):
  powerpc/pmem: Restrict papr_scm to P8 and above.
  powerpc/pmem: Add new instructions for persistent storage and sync
  powerpc/pmem: Add flush routines using new pmem store and sync
    instruction
  libnvdimm/nvdimm/flush: Allow architecture to override the flush
    barrier
  powerpc/pmem/of_pmem: Update of_pmem to use the new barrier
    instruction.
  powerpc/pmem: Avoid the barrier in flush routines
  powerpc/book3s/pmem: Add WARN_ONCE to catch the wrong usage of pmem
    flush functions.
  libnvdimm/dax: Add a dax flag to control synchronous fault support
  powerpc/pmem: Disable synchronous fault by default
  powerpc/pmem: Initialize pmem device on newer hardware

 arch/powerpc/include/asm/cacheflush.h     | 10 ++++
 arch/powerpc/include/asm/ppc-opcode.h     | 12 ++++
 arch/powerpc/lib/pmem.c                   | 46 ++++++++++++--
 arch/powerpc/platforms/Kconfig.cputype    |  9 +++
 arch/powerpc/platforms/pseries/papr_scm.c | 31 +++++++++-
 arch/powerpc/platforms/pseries/pmem.c     |  6 ++
 drivers/dax/bus.c                         |  2 +-
 drivers/dax/super.c                       | 73 +++++++++++++++++++++++
 drivers/md/dm-writecache.c                |  2 +-
 drivers/nvdimm/of_pmem.c                  |  8 +++
 drivers/nvdimm/pmem.c                     |  4 ++
 drivers/nvdimm/region_devs.c              | 24 ++++++--
 include/linux/dax.h                       | 16 +++++
 include/linux/libnvdimm.h                 |  8 +++
 mm/Kconfig                                |  3 +
 15 files changed, 243 insertions(+), 11 deletions(-)
--
2.26.2