[PATCH] uacce: Add uacce_ctrl misc device

Zhou Wang wangzhou1 at hisilicon.com
Fri Jan 22 22:41:23 AEDT 2021


On 2021/1/21 19:52, Song Bao Hua (Barry Song) wrote:
> 
> 
>> -----Original Message-----
>> From: Greg Kroah-Hartman [mailto:gregkh at linuxfoundation.org]
>> Sent: Friday, January 22, 2021 12:19 AM
>> To: Song Bao Hua (Barry Song) <song.bao.hua at hisilicon.com>
>> Cc: Wangzhou (B) <wangzhou1 at hisilicon.com>; Zhangfei Gao
>> <zhangfei.gao at linaro.org>; Arnd Bergmann <arnd at arndb.de>;
>> linux-accelerators at lists.ozlabs.org; linux-kernel at vger.kernel.org;
>> chensihang (A) <chensihang1 at hisilicon.com>
>> Subject: Re: [PATCH] uacce: Add uacce_ctrl misc device
>>
>> On Thu, Jan 21, 2021 at 10:18:24AM +0000, Song Bao Hua (Barry Song) wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Greg Kroah-Hartman [mailto:gregkh at linuxfoundation.org]
>>>> Sent: Thursday, January 21, 2021 10:46 PM
>>>> To: Wangzhou (B) <wangzhou1 at hisilicon.com>
>>>> Cc: Zhangfei Gao <zhangfei.gao at linaro.org>; Arnd Bergmann <arnd at arndb.de>;
>>>> linux-accelerators at lists.ozlabs.org; linux-kernel at vger.kernel.org;
>>>> chensihang (A) <chensihang1 at hisilicon.com>
>>>> Subject: Re: [PATCH] uacce: Add uacce_ctrl misc device
>>>>
>>>> On Thu, Jan 21, 2021 at 05:09:14PM +0800, Zhou Wang wrote:
>>>>> When IO page fault happens, DMA performance will be affected. Pin user
>> page
>>>>> can avoid IO page fault, this patch introduces a new char device named
>>>>> /dev/uacce_ctrl to help to maintain pin/unpin pages. User space can do
>>>>> pin/unpin pages by ioctls of an open file of /dev/uacce_ctrl, all pinned
>>>>> pages under one file will be unpinned in file release process.
>>>>
>>>> Also, what are you really trying to do here?  If you need to mess with
>>>> memory pages, why can't the existing memory apis work properly for you?
>>>> Please work with the linux-mm developers to resolve the issue using the
>>>> standard apis and not creating a one-off char device node for this type
>>>> of thing.
>>>
>>> Basically the purpose is implementing a pinned memory poll for userspace
>>> DMA to achieve better performance by removing io page fault.
>>
>> And what could possibly go wrong with that :)
> 
> I think we have resolved this concern while uacce came in :-)
> Uacce is based on SVA so devices are accessing memory in userspace
> by strict permission control.
> 
>>
>>> I really like this can be done in generic mm code. Unfortunately there is
>> no
>>> this standard API in kernel to support userspace pin. Right now, various
>>> subsystems depend on the ioctl of /dev/<name> to implement the pin, for example,
>>> v4l2, gpu, infiniband, media etc.
>>>
>>> I feel it is extremely hard to sell a standard mpin() API like mlock()
>>> for this stage as mm could hardly buy this. And it will require
>>> huge changes in kernel.
>>
>> Why?  This is what mlock() is for, why can't you use it?
> 
> mlock() can only guarantee memory won't be swapped out, it doesn't
> make sure memory won't move. alloc_pages() can cause memory compaction,
> cma, numa balance, huge pages etc can move mlock()-ed pages.
> We would still see many I/O page faults for mlock() area.
> 
>>
>>> We need a way to manage what pages are pinned by process and ensure the
>>> pages can be unpinned while the process is killed abnormally. otherwise,
>>> memory gets leaked.
>>
>> Can't mlock() handle that?  It works on the process that called it.
>>
>>> file_operations release() is a good entry for this kind of things. In
>>> this way, we don't have to maintain the pinned page set in task_struct
>>> and unpin them during exit().
>>>
>>> If there is anything to make it better by doing this in a driver. I
>>> would believe we could have a generic misc driver for pin like
>>> vms_ballon.c for ballon. The driver doesn't have to bind with uacce.
>>>
>>> In this way, the pinned memory pool implementation in userspace doesn't
>>> need to depend on a specific uacce driver any more.
>>
>> Please work with the mm developers to get them to agree with this type
>> of thing, as well as the dma developers, both of which you didn't cc: on
>> this patch :(
> 
> Yep.

OK. Will send v2 with cc mm-list and iommu-list too, let's see if we could
find a proper way to solve this problem :)

Best,
Zhou

> 
>>
>> Remember, you are creating a new api for Linux that goes around existing
>> syscalls, but is in reality, a new syscall, so why not just make it a
>> new syscall?
> 
> The difficulty would be how to record which pages are pinned for a process
> if it is done by a new syscall.
> 
> For mlock(), it can be much easier as it will change VMA. Hardly we can
> change VMA for pin. On the other hand, if the implementation is done in
> driver, with file_operations, we can record pinned pages in the private
> data of an opened file.
> 
>>
>> thanks,
>>
>> greg k-h
> 
> Thanks
> Barry
> 
> .
> 



More information about the Linux-accelerators mailing list