[PATCH 1/7] mm/memory.c: Fix race when faulting a device private page

Michael Ellerman mpe at ellerman.id.au
Thu Sep 29 15:07:17 AEST 2022


Alistair Popple <apopple at nvidia.com> writes:
> Michael Ellerman <mpe at ellerman.id.au> writes:
>> Alistair Popple <apopple at nvidia.com> writes:
>>> When the CPU tries to access a device private page the migrate_to_ram()
>>> callback associated with the pgmap for the page is called. However no
>>> reference is taken on the faulting page. Therefore a concurrent
>>> migration of the device private page can free the page and possibly the
>>> underlying pgmap. This results in a race which can crash the kernel due
>>> to the migrate_to_ram() function pointer becoming invalid. It also means
>>> drivers can't reliably read the zone_device_data field because the page
>>> may have been freed with memunmap_pages().
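
For context, the window being described is in do_swap_page(): before this
patch the device private branch calls straight into the driver with nothing
pinning the page. Roughly (an illustrative fragment, not the exact upstream
code):

	} else if (is_device_private_entry(entry)) {
		vmf->page = pfn_swap_entry_to_page(entry);
		/*
		 * No reference is held here, so a concurrent migration
		 * can free the page (and possibly its pgmap) before or
		 * while migrate_to_ram() runs.
		 */
		ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
	}
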
>>>
>>> Close the race by getting a reference on the page while holding the ptl
>>> to ensure it has not been freed. Unfortunately the elevated reference
>>> count will cause the migration required to handle the fault to fail. To
>>> avoid this failure pass the faulting page into the migrate_vma functions
>>> so that if an elevated reference count is found it can be checked to see
>>> if it's expected or not.
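
And the fix, as I read it: recheck the pte and take the reference while the
ptl still guarantees the entry points at this page, then drop it once the
callback returns. Again a rough sketch rather than the exact diff:

	} else if (is_device_private_entry(entry)) {
		vmf->page = pfn_swap_entry_to_page(entry);
		vmf->pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
					       vmf->address, &vmf->ptl);
		if (unlikely(!pte_same(*vmf->pte, vmf->orig_pte))) {
			/* Raced with a pte change, bail out */
			pte_unmap_unlock(vmf->pte, vmf->ptl);
			goto out;
		}

		/*
		 * The reference is taken while the ptl is held, so the
		 * page and its pgmap can't be freed under us.
		 */
		get_page(vmf->page);
		pte_unmap_unlock(vmf->pte, vmf->ptl);
		ret = vmf->page->pgmap->ops->migrate_to_ram(vmf);
		put_page(vmf->page);
	}

That extra reference is then what the migrate_vma side has to treat as
expected, hence plumbing the faulting page through to it.
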
>>>
>>> Signed-off-by: Alistair Popple <apopple at nvidia.com>
>>> ---
>>>  arch/powerpc/kvm/book3s_hv_uvmem.c       | 15 ++++++-----
>>>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 17 +++++++------
>>>  drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |  2 +-
>>>  drivers/gpu/drm/amd/amdkfd/kfd_svm.c     | 11 +++++---
>>>  include/linux/migrate.h                  |  8 ++++++-
>>>  lib/test_hmm.c                           |  7 ++---
>>>  mm/memory.c                              | 16 +++++++++++-
>>>  mm/migrate.c                             | 34 ++++++++++++++-----------
>>>  mm/migrate_device.c                      | 18 +++++++++----
>>>  9 files changed, 87 insertions(+), 41 deletions(-)
>>>
>>> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> index 5980063..d4eacf4 100644
>>> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
>>> @@ -508,10 +508,10 @@ unsigned long kvmppc_h_svm_init_start(struct kvm *kvm)
...
>>> @@ -994,7 +997,7 @@ static vm_fault_t kvmppc_uvmem_migrate_to_ram(struct vm_fault *vmf)
>>>
>>>  	if (kvmppc_svm_page_out(vmf->vma, vmf->address,
>>>  				vmf->address + PAGE_SIZE, PAGE_SHIFT,
>>> -				pvt->kvm, pvt->gpa))
>>> +				pvt->kvm, pvt->gpa, vmf->page))
>>>  		return VM_FAULT_SIGBUS;
>>>  	else
>>>  		return 0;
>>
>> I don't have a UV test system, but as-is it doesn't even compile :)
>
> Ugh, thanks. I did get as far as installing a PPC cross-compiler and
> building a kernel. Apparently I did not get as far as enabling
> CONFIG_PPC_UV :)

No worries, that's really on us. If we're going to keep the code in the
tree, it should be enabled in at least one of our defconfigs.

cheers

