[PATCH kernel] powerpc/powernv/ioda: Store correct amount of memory used for table

Alexey Kardashevskiy aik at ozlabs.ru
Wed Feb 13 10:57:12 AEDT 2019



On 12/02/2019 18:33, Alexey Kardashevskiy wrote:
> 
> 
> On 12/02/2019 11:20, David Gibson wrote:
>> On Mon, Feb 11, 2019 at 06:48:01PM +1100, Alexey Kardashevskiy wrote:
>>> We store 2 multilevel tables in iommu_table - one for the hardware and
>>> one with the corresponding userspace addresses. Before allocating
>>> the tables, the iommu_table_group_ops::get_table_size() hook returns
>>> the combined size of the two and VFIO SPAPR TCE IOMMU driver adjusts
>>> the locked_vm counter correctly. When the table is actually allocated,
>>> the amount of allocated memory is stored in iommu_table::it_allocated_size
>>> and used to adjust the locked_vm counter when we release the memory used
>>> by the table; .get_table_size() and .create_table() calculate it
>>> independently but the result is expected to be the same.
>>
>> Any way we can remove that redundant calculation?  That seems like
>> begging for bugs.
> 
> 
> I do not see an easy way. One way could be adding a "dryrun" flag to
> pnv_pci_ioda2_table_alloc_pages(), count allocated memory there and call
> it from .get_table_size() but for multilevel TCEs it only allocates
> first level...


byyyyyy the way even this it_allocated_size is buggy as it is what was
actually allocated so it does not include any indirect levels which
might be allocated later although it should.

I'll simply make it tbl->it_allocated_size=.get_table_size() or just
ditch it_allocated_size, let me see...



> 
> 
>>> Unfortunately the allocator does not add the userspace table size to
>>> ::it_allocated_size so when we destroy the table because of VFIO PCI
>>> unplug (i.e. VFIO container is gone but the userspace keeps running),
>>> we decrement locked_vm by just a half of size of memory we are releasing.
>>> As the result, we leak locked_vm and may not be able to allocate more
>>> IOMMU tables after few iterations of hotplug/unplug.
>>>
>>> This adjusts it_allocated_size if the userspace addresses table was
>>> requested (total_allocated_uas is initialized by zero).
>>>
>>> Fixes: 090bad39b "powerpc/powernv: Add indirect levels to it_userspace"
>>> Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
>>
>> Reviewed-by: David Gibson <david at gibson.dropbear.id.au>
>>
>>> ---
>>>  arch/powerpc/platforms/powernv/pci-ioda-tce.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda-tce.c b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
>>> index 697449a..58146e1 100644
>>> --- a/arch/powerpc/platforms/powernv/pci-ioda-tce.c
>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda-tce.c
>>> @@ -313,7 +313,7 @@ long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset,
>>>  			page_shift);
>>>  	tbl->it_level_size = 1ULL << (level_shift - 3);
>>>  	tbl->it_indirect_levels = levels - 1;
>>> -	tbl->it_allocated_size = total_allocated;
>>> +	tbl->it_allocated_size = total_allocated + total_allocated_uas;
>>>  	tbl->it_userspace = uas;
>>>  	tbl->it_nid = nid;
>>>  
>>
> 

-- 
Alexey


More information about the Linuxppc-dev mailing list