30 bits DMA and ppc

Olof Johansson olof at lixom.net
Mon Oct 31 08:35:17 EST 2005


On Mon, Oct 31, 2005 at 08:14:18AM +1100, Benjamin Herrenschmidt wrote:
> 
> > Keep in mind that those 16MB are cache inhibited. Not sure you'd want
> > that for the bounce buffer. And you can't map the same page as cacheable
> > or you'll end up in an inconsistent state. I guess you could remap the 14MB
> > as 4K cacheable pages somewhere else.
> 
> Euh... I think that's exactly what we do :) We _unmap_ the 16MB page
> from the linear mapping, and we remap a part of it using ioremap() (thus
> as 4k pages) in the DART driver... The remaining bits are thus not
> mapped at all, there is no problem using __ioremap() to get a cacheable
> mapping there.

Sure, that would work.
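
Something along these lines should do it, I imagine (untested sketch;
the variable names and flag values are from memory, so treat it as
illustration only):

/*
 * Illustrative only: dart_tablebase/dart_tablesize stand in for
 * whatever the DART driver really calls them.  The idea: keep the
 * DART table itself cache-inhibited and guarded, and remap the
 * leftover part of the 16MB page as ordinary cacheable 4K pages
 * (flags == 0 should end up as PAGE_KERNEL) for use as bounce space.
 */
#include <linux/errno.h>
#include <asm/io.h>
#include <asm/pgtable.h>

static void __iomem *dart;	/* the DART table, no-cache */
static void *dart_spare;	/* rest of the 16MB page, cacheable */

static int dart_map_window(unsigned long dart_tablebase,
			   unsigned long dart_tablesize)
{
	dart = __ioremap(dart_tablebase, dart_tablesize,
			 _PAGE_NO_CACHE | _PAGE_GUARDED);

	dart_spare = (void __force *)
		__ioremap(dart_tablebase + dart_tablesize,
			  (16UL << 20) - dart_tablesize, 0);

	return (dart && dart_spare) ? 0 : -ENOMEM;
}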

> > Some of the Infiniband and Myrinet adapters like to map as much as they
> > possibly can. I'm not sure what the likeliness of them being used on a
> > machine at the same time as one of these crippled devices is though.
> 
> Why would this be a problem? The infiniband driver would hopefully have
> a sane dma_mask, and thus its mapping requests wouldn't hit the swiotlb
> code path.

Not a problem, I was just thinking out loud. IOMMU pressure might be
higher on these systems than on the average one, but there should still
be enough room.

> > Sounds reasonable to me too. I guess time will tell how hairy it gets,
> > implementation-wise. The implementation could also be nicely abstracted
> > away and isolated thanks to Stephen's per-device-dma-ops stuff.
> 
> Yes, though that's not strictly necessary. The dma_mask should be enough
> to tell us whether to use the normal code path or the swiotlb one. So if
> swiotlb is enabled, it could just be added to the normal code path for
> the non-iommu case.

The non-iommu case might want to do that for other devices as well,
i.e. 32-bit-limited ones on 64-bit machines.
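
Roughly what I have in mind for that path, for the sake of argument
(just a sketch, not tested; the function name is made up and the
swiotlb prototype is what I remember it being):

/*
 * Sketch of the non-iommu map_single path with an swiotlb fallback.
 * I'm hand-waving any PCI bus offset here.
 */
#include <linux/device.h>
#include <linux/dma-mapping.h>
#include <asm/page.h>

/* assumed prototype of the lib/swiotlb.c entry point */
extern dma_addr_t swiotlb_map_single(struct device *hwdev, void *ptr,
				     size_t size, int dir);

static dma_addr_t direct_map_single(struct device *dev, void *ptr,
				    size_t size, int direction)
{
	dma_addr_t addr = __pa(ptr);
	u64 mask = dev->dma_mask ? *dev->dma_mask : 0xffffffffULL;

	/* The buffer is already reachable by the device: map 1:1 */
	if (addr + size - 1 <= mask)
		return addr;

	/* Otherwise bounce through the swiotlb pool sitting below the mask */
	return swiotlb_map_single(dev, ptr, size, direction);
}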

> For the iommu case, I'm not sure. I think we don't
> need bounce buffering. I could fairly easily have the DART code limit
> allocations to a given DMA mask. The TCE one may have more issues since
> the DMA window of a given slot may not fit the requirements, but in that
> case, it's probably just a matter of failing all mapping requests.

Yep, the table is already split once, and in retrospect I'm not sure how
useful that split was anyway; it can maybe go away. Switching it around
shouldn't be a big issue.
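
For the table side of it, maybe something like the below to clamp
allocations to the mask (untested; helper name made up, iommu_table
field names from memory):

/*
 * Rough idea for honouring a small DMA mask in the iommu allocator:
 * figure out how many table entries map below the mask and use that
 * as the upper bound for the bitmap search (0 meaning "fail the
 * mapping").  Assumes it_offset and it_size are counted in 4K pages.
 */
#include <linux/kernel.h>
#include <linux/device.h>
#include <asm/iommu.h>

static unsigned long iommu_limit_for_dev(struct iommu_table *tbl,
					 struct device *dev)
{
	u64 mask = dev->dma_mask ? *dev->dma_mask : 0xffffffffULL;
	unsigned long limit = (mask >> PAGE_SHIFT) + 1;

	/* The DMA window starts entirely above the mask: nothing usable */
	if (limit <= tbl->it_offset)
		return 0;

	return min(tbl->it_size, limit - tbl->it_offset);
}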


-Olof