[PATCH v3] powerpc/papr_scm: Implement support for H_SCM_FLUSH hcall

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Wed Apr 14 15:20:42 AEST 2021


Vaibhav Jain <vaibhav at linux.ibm.com> writes:

> Hi Shiva,
>
> Apologies for a late review but something just caught my eye while
> working on a different patch.
>
> Shivaprasad G Bhat <sbhat at linux.ibm.com> writes:
>
>> Add support for ND_REGION_ASYNC capability if the device tree
>> indicates 'ibm,hcall-flush-required' property in the NVDIMM node.
>> Flush is done by issuing H_SCM_FLUSH hcall to the hypervisor.
>>
>> If the flush request fails, the hypervisor is expected to
>> reflect the problem in a subsequent nvdimm H_SCM_HEALTH call.
>>
>> This patch prevents mmap of namespaces with the MAP_SYNC flag if the
>> nvdimm requires an explicit flush [1].
>>
>> References:
>> [1] https://github.com/avocado-framework-tests/avocado-misc-tests/blob/master/memory/ndctl.py.data/map_sync.c
>>
>> Signed-off-by: Shivaprasad G Bhat <sbhat at linux.ibm.com>
>> ---
>> v2 - https://www.spinics.net/lists/kvm-ppc/msg18799.html
>> Changes from v2:
>>        - Fixed the commit message.
>>        - Add dev_dbg before the H_SCM_FLUSH hcall
>>
>> v1 - https://www.spinics.net/lists/kvm-ppc/msg18272.html
>> Changes from v1:
>>        - Hcall semantics finalized, all changes are to accommodate them.
>>
>>  Documentation/powerpc/papr_hcalls.rst     |   14 ++++++++++
>>  arch/powerpc/include/asm/hvcall.h         |    3 +-
>>  arch/powerpc/platforms/pseries/papr_scm.c |   40 +++++++++++++++++++++++++++++
>>  3 files changed, 56 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/powerpc/papr_hcalls.rst b/Documentation/powerpc/papr_hcalls.rst
>> index 48fcf1255a33..648f278eea8f 100644
>> --- a/Documentation/powerpc/papr_hcalls.rst
>> +++ b/Documentation/powerpc/papr_hcalls.rst
>> @@ -275,6 +275,20 @@ Health Bitmap Flags:
>>  Given a DRC Index collect the performance statistics for NVDIMM and copy them
>>  to the resultBuffer.
>>  
>> +**H_SCM_FLUSH**
>> +
>> +| Input: *drcIndex, continue-token*
>> +| Out: *continue-token*
>> +| Return Value: *H_SUCCESS, H_Parameter, H_P2, H_BUSY*
>> +
>> +Given a DRC Index, flush the data to the backend NVDIMM device.
>> +
>> +The hcall returns H_BUSY when the flush takes longer to complete and the
>> +hcall needs to be issued multiple times in order to be completely serviced.
>> +In that case the *continue-token* from the output is to be passed in the
>> +argument list of subsequent hcalls to the hypervisor until the request is
>> +completely serviced, at which point the hypervisor returns H_SUCCESS or an error.
>> +
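For reference, the driver side from the diffstat above boils down to
something like the sketch below: retry the hcall while the hypervisor
returns H_BUSY, feeding the continue-token from the output buffer back
into the next call. The papr_scm_priv field names here are assumptions
for illustration, not copied from the patch body:

static int papr_scm_pmem_flush(struct nd_region *nd_region,
			       struct bio *bio)
{
	struct papr_scm_priv *p = nd_region_provider_data(nd_region);
	unsigned long ret_buf[PLPAR_HCALL_BUFSIZE], token = 0;
	long rc;

	dev_dbg(&p->pdev->dev, "flush drc 0x%x", p->drc_index);

	do {
		rc = plpar_hcall(H_SCM_FLUSH, ret_buf, p->drc_index, token);
		/* the continue-token comes back in the first output word */
		token = ret_buf[0];

		/* back off if the hypervisor signals a long busy period */
		if (H_IS_LONG_BUSY(rc)) {
			msleep(get_longbusy_msecs(rc));
			rc = H_BUSY;
		} else if (rc == H_BUSY) {
			cond_resched();
		}
	} while (rc == H_BUSY);

	return rc == H_SUCCESS ? 0 : -EIO;
}

At region registration the device tree property is what selects the
async path (dn being the NVDIMM node):

	if (of_property_read_bool(dn, "ibm,hcall-flush-required")) {
		set_bit(ND_REGION_ASYNC, &ndr_desc.flags);
		ndr_desc.flush = papr_scm_pmem_flush;
	}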
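The MAP_SYNC refusal mentioned in the commit message can be observed
from userspace with a small test in the spirit of the map_sync.c
reference [1]; the mount path is made up, and the fallback defines
cover older libc headers:

#include <err.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#ifndef MAP_SHARED_VALIDATE
#define MAP_SHARED_VALIDATE	0x03
#endif
#ifndef MAP_SYNC
#define MAP_SYNC		0x80000
#endif

int main(void)
{
	int fd = open("/mnt/pmem0/testfile", O_RDWR | O_CREAT, 0644);
	void *p;

	if (fd < 0)
		err(1, "open");
	if (ftruncate(fd, 4096))
		err(1, "ftruncate");
	/* expected to fail with EOPNOTSUPP when the backing region
	 * needs an explicit flush */
	p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		 MAP_SHARED_VALIDATE | MAP_SYNC, fd, 0);
	if (p == MAP_FAILED)
		err(1, "mmap with MAP_SYNC");
	return 0;
}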
> Do the hcall semantics also include measures to trigger a barrier/fence
> (like pm_wmb()) so that all the stores before the hcall are guaranteed
> to have hit the pmem controller?

It is up to the hypervisor to implement the right sequence to ensure that
guarantee. The hcall doesn't expect the stores to have reached the
platform buffers.


>
> If not, then the papr_scm_pmem_flush() introduced below should issue a
> fence like pm_wmb() before issuing the flush hcall.
>
> If yes, then this behaviour should also be documented with the hcall
> semantics above.
>

IIUC an fdatasync on the backend file is enough to ensure the guarantee.
Such a request will carry REQ_PREFLUSH and REQ_FUA, which will do the
necessary barrier on the backing device holding the backend file.
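In other words, a host-side fdatasync is all that is needed. A minimal
userspace sketch of that handling, with the backing file path made up
for illustration:

#include <err.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/var/lib/images/nvdimm-backend.img", O_RDWR);

	if (fd < 0)
		err(1, "open");
	if (pwrite(fd, "x", 1, 0) != 1)
		err(1, "pwrite");
	/* generates REQ_PREFLUSH/REQ_FUA on the backing device */
	if (fdatasync(fd))
		err(1, "fdatasync");
	close(fd);
	return 0;
}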

-aneesh

