30-bit DMA and ppc

Olof Johansson olof at lixom.net
Mon Oct 31 04:59:56 EST 2005

On Sun, Oct 30, 2005 at 03:03:55PM +1100, Benjamin Herrenschmidt wrote:

> However, what I can do is have the architecture code reserve a pool of
> memory at boot if the machine's main RAM is bigger than 1GB, to use for
> bounce-buffering. On the G5 with more than 2GB, this is even easier
> since I already have to blow away a 16MB page for use by the IOMMU, but
> the IOMMU only uses 2MB in there, so I have about 14MB that I could
> re-use for that. On 32-bit machines, I can just reserve something early
> during boot.

Keep in mind that those 16MB are cache-inhibited. Not sure you'd want
that for the bounce buffer. And you can't map the same page as cacheable,
or you'll end up in an inconsistent state. I guess you could remap the
14MB as 4K cacheable pages somewhere else.

> Now, how to actually make use of that pool. One way is to hack something
> specific into the bcm43xx driver. The pool can then easily be cut into
> regions: the descriptor ring buffers, plus two pools, one for Rx and one
> for Tx. Allocation inside those pools can be done as a simple ring
> buffer too, due to the inherently ordered processing of packets.
> However, the above would require arch-specific hacks, and would only
> work for one card in the system (too bad if you plug in a CardBus one).
> Another possibility that might be more interesting is to use swiotlb.
> This is a somewhat generic bounce-buffering implementation of the DMA
> mapping routines that is used by ia64 and x86_64 when no IOMMU is
> available. It will automatically do nothing if the address fits the DMA
> mask, so it shouldn't add much overhead to other drivers and would "make
> things work" transparently. In addition, for G5s with more than 2GB of
> RAM (which have an IOMMU), I could modify the IOMMU code to take into
> account the DMA mask when allocating DMA virtual space. (The latter would
> have a slight risk of failure, but I doubt it will happen in practice,
> as it would mean one has more than 1GB of pending DMA at a given point
> in time.)

Some of the InfiniBand and Myrinet adapters like to map as much as they
possibly can. I'm not sure what the likelihood is of them being used on a
machine at the same time as one of these crippled devices, though.
Besides, they usually back off a bit from allocating everything in the
system, so there should be some room.

> I tend to prefer the latter solution ...

Sounds reasonable to me too. I guess time will tell how hairy it gets,
implementation-wise. The implementation could also be nicely abstracted
away and isolated thanks to Stephen's per-device-dma-ops stuff.


More information about the Linuxppc-dev mailing list