[PATCH 2/2] powerpc/nvdimm: use H_SCM_QUERY hcall on H_OVERLAP error

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Thu Aug 29 18:21:33 AEST 2019


On 8/29/19 1:29 PM, Oliver O'Halloran wrote:
> On Thu, Aug 29, 2019 at 4:34 PM Aneesh Kumar K.V
> <aneesh.kumar at linux.ibm.com> wrote:
>>
>> Right now we force an unbind of SCM memory at drcindex on H_OVERLAP error.
>> This really slows down operations like kexec where we get the H_OVERLAP
>> error because we don't go through a full hypervisor re init.
> 
> Maybe we should be unbinding it on a kexec().
> 

shouldn't ?

>> H_OVERLAP error for a H_SCM_BIND_MEM hcall indicates that SCM memory at
>> drc index is already bound. Since we don't specify a logical memory
>> address for bind hcall, we can use the H_SCM_QUERY hcall to query
>> the already bound logical address.
> 
> This is a little sketchy since we might have crashed during the
> initial bind. Checking if the last block is bound to where we expect
> it to be might be a good idea. If it's not where we expect it to be,
> then an unbind->bind cycle is the only sane thing to do.
> 


I would have expected the hypervisor not to mark the drc index as bound
if the previous BIND request failed.

I can query the logical addresses of the start block and the last block
and check whether the full range of blocks is indeed mapped.
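
A rough sketch of that check, written as a userspace model rather than
kernel code. Everything in it is an assumption made for illustration:
mock_query_block() stands in for a per-block H_SCM_QUERY_BLOCK_MEM_BINDING
query, and struct scm_region with its fields is hypothetical, not the
driver's struct papr_scm_priv.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one SCM region (not struct papr_scm_priv). */
struct scm_region {
	uint64_t blocks;            /* number of blocks in the region */
	uint64_t block_size;        /* size of one block in bytes */
	const uint64_t *block_addr; /* mock hypervisor state: logical address
				     * per block, 0 means "not bound" */
};

/*
 * Stand-in for a per-block binding query. Returns 0 and fills *addr when
 * the block is bound, -1 otherwise. The real hcall has a different
 * signature and different return codes.
 */
static int mock_query_block(const struct scm_region *r, uint64_t block,
			    uint64_t *addr)
{
	if (block >= r->blocks || r->block_addr[block] == 0)
		return -1;
	*addr = r->block_addr[block];
	return 0;
}

/*
 * Returns true only when both the first and the last block are bound and
 * the distance between them matches a fully mapped, contiguous region.
 * On success *base holds the logical address of the first block.
 */
static bool region_fully_bound(const struct scm_region *r, uint64_t *base)
{
	uint64_t start, end;

	assert(r->blocks > 0);

	if (mock_query_block(r, 0, &start))
		return false;
	if (mock_query_block(r, r->blocks - 1, &end))
		return false;

	/* The last block must sit exactly (blocks - 1) blocks after the first. */
	if (end - start != (r->blocks - 1) * r->block_size)
		return false;

	*base = start;
	return true;
}
```

When region_fully_bound() returns false, the caller would fall back to
the existing unbind followed by bind path.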


>> Boot time difference with and without patch is:
>>
>> [    5.583617] IOMMU table initialized, virtual merging enabled
>> [    5.603041] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Retrying bind after unbinding
>> [  301.514221] papr_scm ibm,persistent-memory:ibm,pmemory@44108001: Retrying bind after unbinding
>> [  340.057238] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs
> 
> Is the unbind significantly slower than a bind? Or is the region here
> just massive?
> 

It is the unbind that is slow. We have two regions, one of 60G and the
other of 10G.


>> after fix
>>
>> [    5.101572] IOMMU table initialized, virtual merging enabled
>> [    5.116984] papr_scm ibm,persistent-memory:ibm,pmemory@44104001: Querying SCM details
>> [    5.117223] papr_scm ibm,persistent-memory:ibm,pmemory@44108001: Querying SCM details
>> [    5.120530] hv-24x7: read 1530 catalog entries, created 537 event attrs (0 failures), 275 descs
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
>> ---
>>   arch/powerpc/platforms/pseries/papr_scm.c | 26 ++++++++++++++++++++---
>>   1 file changed, 23 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
>> index 220e595cb579..4b74cfe7b334 100644
>> --- a/arch/powerpc/platforms/pseries/papr_scm.c
>> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
>> @@ -110,6 +110,27 @@ static void drc_pmem_unbind(struct papr_scm_priv *p)
>>          return;
>>   }
>>
>> +static int drc_pmem_query(struct papr_scm_priv *p)
>> +{
>> +       unsigned long ret[PLPAR_HCALL_BUFSIZE];
>> +       int64_t rc;
>> +
>> +
>> +       rc = plpar_hcall(H_SCM_QUERY_BLOCK_MEM_BINDING, ret,
>> +                        p->drc_index, 0);
>> +
>> +       if (rc) {
>> +               dev_err(&p->pdev->dev, "Failed to bind SCM");
>> +               return rc;
>> +       }
>> +
>> +       p->bound_addr = ret[0];
>> +       dev_dbg(&p->pdev->dev, "bound drc 0x%x to %pR\n", p->drc_index, &p->res);
>> +
>> +       return 0;
>> +}
>> +
>> +
>>   static int papr_scm_meta_get(struct papr_scm_priv *p,
>>                               struct nd_cmd_get_config_data_hdr *hdr)
>>   {
>> @@ -431,9 +452,8 @@ static int papr_scm_probe(struct platform_device *pdev)
>>
>>          /* If phyp says drc memory still bound then force unbound and retry */
>>          if (rc == H_OVERLAP) {
>> -               dev_warn(&pdev->dev, "Retrying bind after unbinding\n");
>> -               drc_pmem_unbind(p);
>> -               rc = drc_pmem_bind(p);
> 
>> +               dev_warn(&pdev->dev, "Querying SCM details\n");
> 
> That's a pretty vague message. If we're going to treat leaving the
> region bound over kexec() as normal then you might want to bump it
> down to pr_info() or so.

sure.

> 
>> +               rc = drc_pmem_query(p);
>>          }
>>
>>          if (rc != H_SUCCESS) {
>> --
>> 2.21.0
>>
> 


