[PATCH kernel v9 18/32] powerpc/iommu/powernv: Release replaced TCE
    David Gibson 
    david at gibson.dropbear.id.au
       
    Thu Apr 30 14:21:07 AEST 2015
    
    
  
On Wed, Apr 29, 2015 at 07:51:21PM +1000, Alexey Kardashevskiy wrote:
> On 04/29/2015 02:18 PM, David Gibson wrote:
> >On Sat, Apr 25, 2015 at 10:14:42PM +1000, Alexey Kardashevskiy wrote:
> >>At the moment writing new TCE value to the IOMMU table fails with EBUSY
> >>if there is a valid entry already. However PAPR specification allows
> >>the guest to write new TCE value without clearing it first.
> >>
> >>Another problem this patch is addressing is the use of pool locks for
> >>external IOMMU users such as VFIO. The pool locks are to protect
> >>DMA page allocator rather than entries and since the host kernel does
> >>not control what pages are in use, there is no point in pool locks and
> >>exchange()+put_page(oldtce) is sufficient to avoid possible races.
> >>
> >>This adds an exchange() callback to iommu_table_ops which does the same
> >>thing as set() plus it returns replaced TCE and DMA direction so
> >>the caller can release the pages afterwards. The exchange() receives
> >>a physical address unlike set() which receives linear mapping address;
> >>and returns a physical address as the clear() does.
> >>
> >>This implements exchange() for P5IOC2/IODA/IODA2. This adds a requirement
> >>for a platform to have exchange() implemented in order to support VFIO.
> >>
> >>This replaces iommu_tce_build() and iommu_clear_tce() with
> >>a single iommu_tce_xchg().
> >>
> >>This makes sure that TCE permission bits are not set in TCE passed to
> >>IOMMU API as those are to be calculated by platform code from DMA direction.
> >>
> >>This moves SetPageDirty() to the IOMMU code to make it work for both
> >>VFIO ioctl interface in in-kernel TCE acceleration (when it becomes
> >>available later).
> >>
> >>Signed-off-by: Alexey Kardashevskiy <aik at ozlabs.ru>
> >>[aw: for the vfio related changes]
> >>Acked-by: Alex Williamson <alex.williamson at redhat.com>
> >
> >This looks mostly good, but there are couple of details that need fixing.
> >
> 
> 
> [...]
> 
> >>diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> >>index ba75aa5..e8802ac 100644
> >>--- a/arch/powerpc/platforms/powernv/pci.c
> >>+++ b/arch/powerpc/platforms/powernv/pci.c
> >>@@ -598,6 +598,23 @@ int pnv_tce_build(struct iommu_table *tbl, long index, long npages,
> >>  	return 0;
> >>  }
> >>
> >>+#ifdef CONFIG_IOMMU_API
> >>+int pnv_tce_xchg(struct iommu_table *tbl, long index,
> >>+		unsigned long *tce, enum dma_data_direction *direction)
> >>+{
> >>+	u64 proto_tce = iommu_direction_to_tce_perm(*direction);
> >>+	unsigned long newtce = *tce | proto_tce;
> >>+	unsigned long idx = index - tbl->it_offset;
> >
> >Should this have a BUG_ON or WARN_ON if the supplied tce has bits set
> >below the page mask?
> 
> 
> Why? The caller checks these bits, do we really need to duplicate it
> here?
Because this is the crunch point where bad bits will actually cause
strange stuff to happen.
As much as anything the point of a BUG_ON would be to document that
this function expects the parameter to be aligned.
> 
> 
> >>+	*tce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
> >>+	*tce = be64_to_cpu(*tce);
> >>+	*direction = iommu_tce_direction(*tce);
> >>+	*tce &= ~(TCE_PCI_READ | TCE_PCI_WRITE);
> >>+
> >>+	return 0;
> >>+}
> >>+#endif
> 
> 
> 
-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20150430/8647bfd8/attachment.sig>
    
    
More information about the Linuxppc-dev
mailing list