[PATCH v8 RFC 1/3] sparc: Break up monolithic iommu table/lock into finer granularity pools and lock

David Laight David.Laight at ACULAB.COM
Wed Apr 1 02:15:33 AEDT 2015


From: Sowmini Varadhan
> Investigation of multithreaded iperf experiments on an ethernet
> interface shows the iommu->lock as the hottest lock identified by
> lockstat, with something of the order of 21M contentions out of
> 27M acquisitions, and an average wait time of 26 us for the lock.
> This is not efficient. A more scalable design is to follow the ppc
> model, where the iommu_map_table has multiple pools, each stretching
> over a segment of the map, and with a separate lock for each pool.
> This model allows for better parallelization of the iommu map search.
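
A minimal user-space sketch of the pooled layout described in the quoted
text, for illustration only. It is not the patch's code: the names
(pooled_table, table_alloc, NR_POOLS and so on), the sizes, the spinlock
built on C11 atomic_flag, and the "cpu % NR_POOLS" pool choice are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_ENTRIES 1024u                    /* hypothetical map size */
#define NR_POOLS      4u                       /* hypothetical pool count */
#define POOL_ENTRIES  (TABLE_ENTRIES / NR_POOLS)

struct pool {
    atomic_flag lock;   /* protects only this pool's segment of the map */
    size_t start;       /* first map entry owned by this pool */
    size_t hint;        /* offset at which the next search begins */
};

struct pooled_table {
    bool used[TABLE_ENTRIES];
    struct pool pools[NR_POOLS];
};

static void pool_lock(struct pool *pl)
{
    while (atomic_flag_test_and_set_explicit(&pl->lock, memory_order_acquire))
        ;   /* spin until the current holder releases the lock */
}

static void pool_unlock(struct pool *pl)
{
    atomic_flag_clear_explicit(&pl->lock, memory_order_release);
}

static void table_init(struct pooled_table *t)
{
    for (size_t i = 0; i < TABLE_ENTRIES; i++)
        t->used[i] = false;
    for (size_t p = 0; p < NR_POOLS; p++) {
        atomic_flag_clear(&t->pools[p].lock);
        t->pools[p].start = p * POOL_ENTRIES;
        t->pools[p].hint = 0;
    }
}

/* Allocate one entry from the pool selected by the caller-supplied cpu id.
 * Returns the entry index, or -1 if that pool is full. */
static long table_alloc(struct pooled_table *t, unsigned int cpu)
{
    struct pool *pl = &t->pools[cpu % NR_POOLS];
    long found = -1;

    pool_lock(pl);
    for (size_t n = 0; n < POOL_ENTRIES; n++) {
        size_t off = (pl->hint + n) % POOL_ENTRIES;
        if (!t->used[pl->start + off]) {
            t->used[pl->start + off] = true;
            pl->hint = (off + 1) % POOL_ENTRIES;
            found = (long)(pl->start + off);
            break;
        }
    }
    pool_unlock(pl);
    return found;
}

/* Free an entry, taking only the lock of the pool that owns it. */
static void table_free(struct pooled_table *t, long entry)
{
    assert(entry >= 0 && (size_t)entry < TABLE_ENTRIES);
    struct pool *pl = &t->pools[(size_t)entry / POOL_ENTRIES];

    pool_lock(pl);
    t->used[entry] = false;
    pool_unlock(pl);
}
```

Two callers with different cpu ids (0 and 1, say) search different
segments under different locks, so they do not contend with each other.
This sketch does not fall back to another pool when the chosen one is full.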

I've wondered whether the iommu setup for ethernet receive (in particular)
could be made much more efficient if there were a function that
would unmap one buffer and map a second buffer in a single operation.
My thought is that the iommu pte used by the old buffer could just
be modified to reference the new one.
In effect each ring entry would end up using a fixed iommu pte.
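
A rough sketch of that idea as a toy model, not a real kernel interface.
Everything named here (sketch_iommu, slot_remap, SKETCH_PTE_VALID, the 8K
page shift and the ring size) is hypothetical, and a real implementation
would presumably also have to flush any cached translation for the slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE         8u     /* hypothetical receive ring size */
#define SKETCH_PAGE_SHIFT 13u    /* assumes 8K iommu pages */
#define SKETCH_PTE_VALID  1ull   /* hypothetical "valid" bit in a pte */

struct sketch_iommu {
    uint64_t pte[RING_SIZE];     /* one fixed translation entry per ring slot */
    uint64_t dma_base;           /* bus address translated by pte[0] */
};

/* The bus address of a ring slot never changes, because the slot always
 * uses the same pte. */
static uint64_t slot_dma_addr(const struct sketch_iommu *io, size_t slot)
{
    assert(slot < RING_SIZE);
    return io->dma_base + ((uint64_t)slot << SKETCH_PAGE_SHIFT);
}

/* Unmap the old buffer and map the new one in a single step by rewriting
 * the slot's pte in place. Returns the physical address of the old buffer,
 * or 0 if the slot was empty. No allocator search or table lock is needed. */
static uint64_t slot_remap(struct sketch_iommu *io, size_t slot,
                           uint64_t new_phys)
{
    assert(slot < RING_SIZE);
    assert((new_phys & SKETCH_PTE_VALID) == 0);  /* buffer must be aligned */

    uint64_t old_phys = io->pte[slot] & ~SKETCH_PTE_VALID;

    io->pte[slot] = new_phys | SKETCH_PTE_VALID;
    /* Real hardware would need its cached translation invalidated here. */
    return old_phys;
}
```

The driver would call slot_remap() when it refills a ring entry, and the
descriptor's bus address (slot_dma_addr()) would stay the same for the
lifetime of the ring.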

The other question is how much data can be copied in 26us?
On iommu systems, 'copybreak' limits on receive and transmit
may need to be quite high.
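
As a back-of-envelope illustration of that question: the bytes copied in a
given time is just copy bandwidth multiplied by time. The bandwidth figures
used below are assumptions picked for the arithmetic, not measurements, and
the function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bytes that can be copied in 'usecs' microseconds at a sustained copy
 * bandwidth of 'bytes_per_sec'. */
static uint64_t bytes_copied_in(uint64_t usecs, uint64_t bytes_per_sec)
{
    /* Guard against overflow in the multiplication below. */
    assert(usecs == 0 || bytes_per_sec <= UINT64_MAX / usecs);
    return bytes_per_sec * usecs / 1000000u;
}

/* True if copying 'len' bytes takes no longer than 'map_cost_usecs', i.e.
 * if 'len' is at or below the break-even copybreak for that mapping cost. */
static bool copy_is_cheaper(uint64_t len, uint64_t map_cost_usecs,
                            uint64_t bytes_per_sec)
{
    return len <= bytes_copied_in(map_cost_usecs, bytes_per_sec);
}
```

With an assumed 1 GB/s copy rate, bytes_copied_in(26, 1000000000) gives
26000 bytes, so a full 1500 byte frame could be copied many times over in
the average lock wait alone. At an assumed 4 GB/s the figure is 104000
bytes.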

	David


