[PATCH 8/9] iommu: add support for ARM Ltd. System MMU architecture

Joerg Roedel joro at 8bytes.org
Sat Jun 22 00:13:07 EST 2013


On Fri, Jun 21, 2013 at 11:23:18AM +0100, Will Deacon wrote:
> The results were that the memory-to-memory DMA didn't show any corruption. I
> also managed to tickle access faults by messing around with the permissions,
> then remap the buffers and resume the transfers.

That sounds pretty conclusive. So when real hardware shows up it should
work reasonably well.

> > > +static struct arm_smmu_device *find_parent_smmu(struct arm_smmu_device *smmu)
> > > +{
> > > +	struct arm_smmu_device *parent, *tmp;
> > > +
> > > +	if (!smmu->parent_of_node)
> > > +		return NULL;
> > > +
> > > +	list_for_each_entry_safe(parent, tmp, &arm_smmu_devices, list)
> > > +		if (parent->dev->of_node == smmu->parent_of_node)
> > > +			return parent;
> > 
> > Why do you need the _safe variant here? You are not changing the list in
> > this loop so you should be fine with list_for_each_entry().
> 
> For a system with multiple SMMUs (regardless of chaining), couldn't this
> code run in parallel with probing of another SMMU (which has to add to the
> arm_smmu_devices list)? The same applies for device removal, which could
> perhaps be driven from some power-management code.

Well, '_safe' does not mean safe against concurrent list manipulation.
If you want protection from that you still need a lock. The '_safe'
variant only allows you to remove the current element from the list
while traversing it.
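
To illustrate the difference, here is a minimal userspace sketch. It is
not the kernel's list.h: struct node, for_each_node(),
for_each_node_safe(), push(), count_nodes() and free_all() are made-up
names for illustration only. The point is that the "safe" form merely
caches the successor before the loop body runs, so the body may free the
current element. Neither form takes a lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a list element; not struct list_head. */
struct node {
	struct node *next;
	int id;
};

/* Plain traversal: reads pos->next after the body, so the body must not free pos. */
#define for_each_node(pos, head) \
	for (pos = (head); pos; pos = pos->next)

/* "Safe" traversal: saves the successor in tmp before the body runs. */
#define for_each_node_safe(pos, tmp, head) \
	for (pos = (head); pos && ((tmp = pos->next), 1); pos = tmp)

/* Prepend a new node and return the new head. */
static struct node *push(struct node *head, int id)
{
	struct node *n = malloc(sizeof(*n));

	assert(n != NULL);
	n->id = id;
	n->next = head;
	return n;
}

/* Read-only walk: the plain variant is sufficient. */
static size_t count_nodes(struct node *head)
{
	struct node *pos;
	size_t n = 0;

	for_each_node(pos, head)
		n++;
	return n;
}

/* The body frees pos, so the successor has to be saved first. */
static size_t free_all(struct node *head)
{
	struct node *pos, *tmp;
	size_t n = 0;

	for_each_node_safe(pos, tmp, head) {
		free(pos);
		n++;
	}
	return n;
}
```

If another thread modified the list during either loop, both would be
equally broken. That case needs a lock around the traversal.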

> > > +	do {
> > > +		phys_addr_t output_mask = (1ULL << smmu->s2_output_size) - 1;
> > > +		if ((phys_addr_t)iova & ~output_mask)
> > > +			return -ERANGE;
> > > +	} while ((smmu = find_parent_smmu(smmu)));
> > 
> > This looks a bit too expensive to have in the map path. How about saving
> > something like an effective_output_mask (or output_size) which contains
> > the logical OR of every mask up the path? This would make this check a
> > lot cheaper.
> 
> As mentioned in the DT binding thread, it's rare that this loop would
> execute more than once, and largely inconceivable that it would execute more
> than twice, so I don't know how much we need to worry about the cost.

But still, this code is a challenge for the branch predictor, and there
are the additional function calls to find_parent_smmu(). I still think
it is worth optimizing this away. The map function is supposed to be a
fast-path function.
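
A rough sketch of what such a cached check could look like, written as
standalone C rather than against the actual patch. struct smmu, its
parent pointer, effective_output_mask, smmu_cache_output_mask() and
smmu_check_iova() are hypothetical names. I assume here that the chain
is walked once at probe time and that the cached value is the
intersection of the per-SMMU output masks, which rejects the same
addresses as ORing the inverted masks.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct arm_smmu_device. */
struct smmu {
	struct smmu *parent;            /* NULL for the last SMMU in the chain */
	unsigned int s2_output_size;    /* output address width in bits */
	uint64_t effective_output_mask; /* cached, covers the whole chain */
};

/* Walk the chain once (e.g. at probe time) and cache the combined mask. */
static void smmu_cache_output_mask(struct smmu *smmu)
{
	uint64_t mask = ~0ULL;
	struct smmu *s;

	for (s = smmu; s; s = s->parent) {
		/* A shift by 64 or more would be undefined behaviour. */
		assert(s->s2_output_size > 0 && s->s2_output_size < 64);
		mask &= (1ULL << s->s2_output_size) - 1;
	}
	smmu->effective_output_mask = mask;
}

/* Map-path check: one AND and one branch, no walk up the chain. */
static int smmu_check_iova(const struct smmu *smmu, uint64_t iova)
{
	return (iova & ~smmu->effective_output_mask) ? -ERANGE : 0;
}
```

The cached mask would have to be recomputed whenever an SMMU on the path
is added or removed, but that happens far less often than mapping.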


	Joerg



