[RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce

Kenneth Lee liguozhu at hisilicon.com
Sat Nov 24 15:13:21 AEDT 2018

On Fri, Nov 23, 2018 at 11:05:04AM -0700, Jason Gunthorpe wrote:
> On Fri, Nov 23, 2018 at 04:02:42PM +0800, Kenneth Lee wrote:
> > It is already part of Jean's patchset. And that's why I built my solution on
> > VFIO in the first place. But I think the concept of SVA and PASID is not
> > compatible with the original VFIO concept space. You would not share your whole
> > address space to a device at all in a virtual machine manager,
> > wouldn't you?
> Why not? That seems to fit VFIO's space just fine to me.. You might
> need a new upcall to create a full MM registration, but that doesn't
> seem unsuited.

Because a VM manager (such as qemu) does not want to share its whole address
space with the device. That would be a security problem.

> Part of the point here is you should try to make sensible revisions to
> existing subsystems before just inventing a new thing...
> VFIO is deeply connected to the IOMMU, so enabling more general IOMMU
> based approaches seems perfectly fine to me..
> > > Once the VFIO driver knows about this as a generic capability then the
> > > device it exposes to userspace would use CPU addresses instead of DMA
> > > addresses.
> > > 
> > > The question is if your driver needs much more than the device
> > > agnostic generic services VFIO provides.
> > > 
> > > I'm not sure what you have in mind with resource management.. It is
> > > hard to revoke resources from userspace, unless you are doing
> > > kernel syscalls, but then why do all this?
> > 
> > Say, I have 1024 queues in my accelerator. I can get one by opening the device
> > and attach it with the fd. If the process exit by any means, the queue can be
> > returned with the release of the fd. But if it is mdev, it will still be there
> > and some one should tell the allocator it is available again. This is not easy
> > to design in user space.
> ?? why wouldn't the mdev track the queues assigned using the existing
> open/close/ioctl callbacks?
> That is basic flow I would expect:
>  open(/dev/vfio)
>  ioctl(unity map entire process MM to mdev with IOMMU)
>  // Create a HW queue and link the PASID in the HW to this HW queue
>  struct hw queue[..];
>  ioctl(create HW queue)
>  // Get BAR doorbell memory for the queue
>  bar = mmap()
>  // Submit work to the queue using CPU addresses
>  queue[0] = ...
>  writel(bar [..], &queue);
>  // Queue, SVA, etc is cleaned up when the VFIO closes
>  close()

This is not the way that you can use mdev. To use mdev, you have to:

1. Unbind the kernel driver from the device and rebind the device to the vfio
   driver.
2. For each of the 1024 queues: uuid > /sys/.../the_dev/mdev/create, to create
   all the mdevs.
3. A virtual iommu_group will be created in /dev/vfio/* for every mdev.

Now you can do this in your application (even without considering the PASID):

	container = open(/dev/vfio);
	ioctl(container, setting);
	group = open(/dev/vfio/my_group_for_particular_mdev);
	ioctl(container, attach_group, group);
	device = ioctl(group, get_device);
	ioctl(container, set_dma_operation);
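
For reference, the pseudocode above maps roughly onto the VFIO type1 user API
as in the sketch below. This is only an illustration under some assumptions:
the helper names (vfio_open_mdev, vfio_close, struct vfio_handles) are
hypothetical, the group path and device name are supplied by the caller, and
the type1 IOMMU backend is assumed. Note that the real API wants the group
attached to the container before the IOMMU model is set, so the order differs
slightly from the pseudocode.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Hypothetical bundle of the three fds a VFIO user has to keep open. */
struct vfio_handles {
	int container;
	int group;
	int device;
};

static void vfio_close(struct vfio_handles *h)
{
	if (h->device >= 0)
		close(h->device);
	if (h->group >= 0)
		close(h->group);
	if (h->container >= 0)
		close(h->container);
	h->device = h->group = h->container = -1;
}

/*
 * Open the container, attach one group, get the device fd and map one
 * buffer for DMA at IOVA 0. Returns 0 on success; on failure returns -1
 * with every fd closed and set to -1.
 */
static int vfio_open_mdev(struct vfio_handles *h, const char *group_path,
			  const char *dev_name, void *buf, size_t len)
{
	struct vfio_group_status status = { .argsz = sizeof(status) };
	struct vfio_iommu_type1_dma_map map = { .argsz = sizeof(map) };

	h->container = h->group = h->device = -1;

	h->container = open("/dev/vfio/vfio", O_RDWR);
	if (h->container < 0)
		goto fail;
	if (ioctl(h->container, VFIO_GET_API_VERSION) != VFIO_API_VERSION)
		goto fail;

	h->group = open(group_path, O_RDWR);
	if (h->group < 0)
		goto fail;
	if (ioctl(h->group, VFIO_GROUP_GET_STATUS, &status) < 0 ||
	    !(status.flags & VFIO_GROUP_FLAGS_VIABLE))
		goto fail;

	/* Attach the group first, then choose the IOMMU model. */
	if (ioctl(h->group, VFIO_GROUP_SET_CONTAINER, &h->container) < 0)
		goto fail;
	if (ioctl(h->container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU) < 0)
		goto fail;

	h->device = ioctl(h->group, VFIO_GROUP_GET_DEVICE_FD, dev_name);
	if (h->device < 0)
		goto fail;

	/* Every buffer the device touches must be mapped explicitly. */
	map.vaddr = (uintptr_t)buf;
	map.iova = 0;
	map.size = len;
	map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
	if (ioctl(h->container, VFIO_IOMMU_MAP_DMA, &map) < 0)
		goto fail;

	return 0;
fail:
	vfio_close(h);
	return -1;
}
```

Even in this form the application still has to know which group path belongs
to a free mdev before it can call anything.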

Then you have to decide how to find an available mdev and how to return it
when you are done.

We have considered creating only one mdev and allocating a queue when the device
is opened. But the VFIO maintainer, Alex, did not agree and said it broke the
original idea of VFIO.
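
To make the "queue per open" idea concrete, here is a minimal user-space model
of it. It is a sketch, not driver code: accel_dev, accel_file, accel_open and
accel_release are hypothetical names standing in for the device, the per-fd
state and the open/release file operations. Locking is left out on purpose.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_QUEUES 1024	/* queue count used in the example earlier in this thread */

/* Hypothetical device state: one busy flag per hardware queue. */
struct accel_dev {
	bool busy[NR_QUEUES];
};

/* Hypothetical per-open state, i.e. what would hang off the fd. */
struct accel_file {
	struct accel_dev *dev;
	int queue;		/* -1 when no queue is held */
};

/* Model of open(): claim the first free queue for this fd. */
static int accel_open(struct accel_dev *dev, struct accel_file *f)
{
	f->dev = dev;
	f->queue = -1;
	for (int q = 0; q < NR_QUEUES; q++) {
		if (!dev->busy[q]) {
			dev->busy[q] = true;
			f->queue = q;
			return 0;
		}
	}
	return -1;	/* all queues are in use */
}

/*
 * Model of release(): this runs when the fd goes away, including when
 * the process exits, so the queue always returns to the pool.
 */
static void accel_release(struct accel_file *f)
{
	if (f->queue >= 0)
		f->dev->busy[f->queue] = false;
	f->queue = -1;
}
```

The point of the model is that the allocator lives next to the fd lifetime, so
no user-space component has to track which queue is free.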

> Presumably the kernel has to handle the PASID and related for security
> reasons, so they shouldn't go to userspace?
> If there is something missing in vfio to do this is it looks pretty
> small to me..
> Jason

