[DOC][PATCH v5 1/4] powerpc: Document some HCalls for Storage Class Memory

Vaibhav Jain vaibhav at linux.ibm.com
Wed Jul 24 02:13:54 AEST 2019

This doc patch provides an initial description of the HCall op-codes
that are used by the Linux kernel running as a guest operating
system (LPAR) on top of PowerVM or any other sPAPR compliant
hypervisor (e.g. QEMU).

Apart from documenting the HCalls, the doc patch also provides a
rudimentary overview of how HCalls are implemented inside the Linux
kernel and how information flows between the kernel and PowerVM/KVM.

Signed-off-by: Vaibhav Jain <vaibhav at linux.ibm.com>

* First patch in this patchset.
 Documentation/powerpc/hcalls.txt | 140 +++++++++++++++++++++++++++++++
 1 file changed, 140 insertions(+)
 create mode 100644 Documentation/powerpc/hcalls.txt

diff --git a/Documentation/powerpc/hcalls.txt b/Documentation/powerpc/hcalls.txt
new file mode 100644
index 000000000000..cc9dd872cecd
--- /dev/null
+++ b/Documentation/powerpc/hcalls.txt
@@ -0,0 +1,140 @@
+Hypervisor Call Op-codes (HCALLS)
+Virtualization on the PPC64 arch is based on the PAPR specification[1], which
+describes the run-time environment for a guest operating system and how it
+should interact with the hypervisor for privileged operations. Currently there
+are two PAPR compliant hypervisors (PHYP):
+IBM PowerVM: IBM's proprietary hypervisor that supports AIX, IBM-i and Linux as
+	     guests (termed Logical Partitions or LPARs).
+Qemu/KVM:    Supports PPC64 Linux guests running on a PPC64 Linux host.
+On the PPC64 arch a virtualized guest kernel runs in a non-privileged mode
+(HV=0). Hence, to perform a privileged operation the guest issues a Hypervisor
+Call (HCALL) with the necessary input operands. After performing the privileged
+operation, PHYP returns a status code and output operands back to the guest.
+The ABI specification for an HCall between the guest OS kernel and PHYP is
+described in [1]. The opcode for the HCall is set in R3 and subsequent
+in-arguments for the HCall are provided in registers R4-R12. On return from the
+'HVCS' instruction the status code of the HCall is available in R3 and the
+output parameters are returned in registers R4-R12.
+The powerpc arch code provides convenient wrappers named plpar_hcall_xxx,
+defined in the header 'hvcall.h', to issue HCalls from a Linux kernel running
+as a guest.
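The register convention described above can be modelled in plain user-space C. This is only a sketch under stated assumptions: `mock_regs`, `mock_hypervisor`, `mock_hcall`, the `MOCK_*` status values and the `MOCK_H_ADD` opcode are all hypothetical names invented for illustration, and nothing here is the kernel's plpar_hcall_xxx implementation. It only shows the flow of opcode and operands into R3/R4-R12 and of status and results back out.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical names throughout: a user-space model, not kernel code. */
#define MOCK_H_SUCCESS   0L       /* "call succeeded" in this model */
#define MOCK_H_FUNCTION  (-2L)    /* "unknown opcode" in this model */
#define MOCK_H_ADD       0x1000UL /* made-up opcode, not a real HCALL */
#define MOCK_NARGS       9        /* R4..R12 carry up to nine operands */

/* Register file at the guest/hypervisor boundary: r[3] is R3, and so on. */
struct mock_regs {
	unsigned long r[13];
};

/* Stand-in hypervisor: reads R3 (opcode) and R4.., writes R3 (status), R4.. */
static void mock_hypervisor(struct mock_regs *regs)
{
	switch (regs->r[3]) {
	case MOCK_H_ADD:
		regs->r[4] = regs->r[4] + regs->r[5]; /* output operand in R4 */
		regs->r[3] = (unsigned long)MOCK_H_SUCCESS;
		break;
	default:
		regs->r[3] = (unsigned long)MOCK_H_FUNCTION;
		break;
	}
}

/* Wrapper in the spirit of plpar_hcall_xxx: load registers, call, unpack. */
static long mock_hcall(unsigned long opcode, unsigned long *retbuf,
		       int nargs, ...)
{
	struct mock_regs regs = { { 0 } };
	va_list ap;
	int i;

	assert(nargs >= 0 && nargs <= MOCK_NARGS);
	regs.r[3] = opcode;                 /* opcode goes in R3 */
	va_start(ap, nargs);
	for (i = 0; i < nargs; i++)         /* in-arguments go in R4.. */
		regs.r[4 + i] = va_arg(ap, unsigned long);
	va_end(ap);

	mock_hypervisor(&regs);             /* models the trap to PHYP */

	for (i = 0; i < MOCK_NARGS; i++)    /* outputs come back in R4..R12 */
		retbuf[i] = regs.r[4 + i];
	return (long)regs.r[3];             /* status comes back in R3 */
}
```

A caller passes a nine-entry `retbuf` and operands of type `unsigned long`, then checks the returned status before reading any output operand.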
+DRC & DRC Indexes
+		 PAPR		     Guest
+  DR1          Hypervisor             OS
+  +--+        +----------+         +---------+
+  |  |<------>|          |         |  User   |
+  +--+  DRC1  |          |   DRC   |  Space  |
+	      |          |  Index  +---------+
+  DR2         |          |         |         |
+  +--+        |          |<------->|  Kernel |
+  |  |<------>|          |  HCall  |         |
+  +--+  DRC2  +----------+         +---------+
+PHYP refers to shared hardware resources such as PCI devices, NVDimms etc. that
+are available for use by LPARs as Dynamic Resources (DR). When a DR is
+allocated to an LPAR, PHYP creates a data structure called a Dynamic Resource
+Connector (DRC) to manage LPAR access. An LPAR refers to a DRC via an opaque
+32-bit number called the DRC-Index. The DRC-Index value is provided to the LPAR
+via the device tree, where it is present as an attribute in the device tree
+node associated with the DR.
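Since the DRC-Index is a 32-bit value delivered through the device tree, and device-tree cells are stored big-endian, a guest has to decode it before passing it to an HCALL. A minimal sketch of that decoding follows; `dt_cell_to_u32` and `drc_index_from_prop` are hypothetical helper names, and the example property bytes are made up. In-kernel code would normally rely on the existing device-tree accessors instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one 32-bit device-tree cell (stored big-endian) into host order. */
static uint32_t dt_cell_to_u32(const uint8_t cell[4])
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/*
 * Hypothetical helper: extract a DRC-Index from raw property bytes.
 * Returns 0 on success, -1 if the property is not exactly one cell long.
 */
static int drc_index_from_prop(const uint8_t *prop, size_t len,
			       uint32_t *drc_index)
{
	assert(prop != NULL && drc_index != NULL);
	if (len != 4)
		return -1;
	*drc_index = dt_cell_to_u32(prop);
	return 0;
}
```

The value obtained this way is opaque to the guest: it is only meaningful as an operand handed back to PHYP.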
+HCALL Op-codes
+Below is a partial list of HCALLs that are supported by PHYP. For the
+corresponding opcode values please look into the header
+'arch/powerpc/include/asm/hvcall.h' :
+  Input: drcIndex, offset, buffer-address, numBytesToRead
+  Out: None
+  Description:
+  Given a DRC Index of an NVDimm, read N bytes from the metadata area
+  associated with it, at a specified offset, and copy them to the provided
+  buffer. The metadata area stores configuration information such as label
+  information, bad blocks etc. The metadata area is located out-of-band of the
+  NVDimm storage area, hence separate access semantics are provided.
+  Input: drcIndex, offset, data, numBytesToWrite
+  Out: None
+  Description:
+  Given a DRC Index of an NVDimm, write N bytes from the provided buffer at the
+  given offset to the metadata area associated with the NVDimm.
+  Input: drcIndex, startingScmBlockIndex, numScmBlocksToBind, targetAddress
+  Out: guestMappedAddress, numScmBlockBound
+  Description:
+  Given a DRC-Index of an NVDimm, map the SCM (Storage Class Memory) blocks to
+  contiguous logical addresses in the guest physical address space. The HCALL
+  arguments can be used to map a partial range of SCM blocks instead of the
+  entire NVDimm range to the LPAR.
+  Input: drcIndex, startingScmLogicalMemoryAddress, numScmBlocksToUnbind
+  Out: numScmBlocksUnbound
+  Description:
+  Given a DRC-Index of an NVDimm, unmap one or more of the SCM blocks from the
+  guest physical address space. The HCALL can fail if the guest has an active
+  PTE for an SCM block being unbound.
+  Input: drcIndex, scmBlockIndex
+  Out: Guest-Physical-Address
+  Description:
+  Given a DRC-Index and an SCM block index, return the guest physical address
+  to which the SCM block is mapped.
+  Input: Guest-Physical-Address
+  Out: drcIndex, scmBlockIndex
+  Description:
+  Given a guest physical address, return the DRC Index and SCM block index that
+  are mapped to that address.
+  Input: scmTargetScope, drcIndex
+  Out: None
+  Description:
+  Depending on the target scope, unmap from the LPAR memory either all SCM
+  blocks belonging to all NVDimms, or all SCM blocks belonging to a single
+  NVDimm identified by its drcIndex.
+  Input: drcIndex
+  Output: health-bitmap, health-bit-valid-bitmap
+  Description:
+  Given a DRC Index return the info on predictive failure and over all health of
+  the NVDimm. The asserted bits in the health-bitmap indicate a single predictive
+  failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
+  valid.
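The relationship between the two bitmaps can be sketched as a simple mask: a bit in the health-bitmap should only be trusted when the same bit is set in the health-bit-valid-bitmap. The code below is an illustration under assumptions, not the kernel's handling: the 64-bit width, the function names and the `EXAMPLE_HEALTH_BIT_*` positions are hypothetical, since the text above does not define individual bit meanings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions for this sketch only; real meanings are in PAPR. */
#define EXAMPLE_HEALTH_BIT_A (UINT64_C(1) << 0)
#define EXAMPLE_HEALTH_BIT_B (UINT64_C(1) << 1)

/* Keep only the asserted health bits that the hypervisor marked as valid. */
static uint64_t scm_valid_failures(uint64_t health, uint64_t valid)
{
	return health & valid;
}

/* True if the given condition is both marked valid and asserted. */
static bool scm_failure_reported(uint64_t health, uint64_t valid, uint64_t bit)
{
	assert(bit != 0);
	return (scm_valid_failures(health, valid) & bit) != 0;
}
```

Masking first means that stale or unsupported bits in the health-bitmap are never interpreted as failures.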
+  Input: drcIndex, resultBuffer Addr
+  Out: None
+  Description:
+  Given a DRC Index, collect the performance statistics for the NVDimm and copy
+  them to the resultBuffer.
+[1]: "Linux on Power Architecture Platform Reference"
+     https://members.openpowerfoundation.org/document/dl/469
