[PATCH 8/9] iommu: add support for ARM Ltd. System MMU architecture
Joerg Roedel
joro at 8bytes.org
Sat Jun 22 00:13:07 EST 2013
On Fri, Jun 21, 2013 at 11:23:18AM +0100, Will Deacon wrote:
> The results were that the memory-to-memory DMA didn't show any corruption. I
> also managed to tickle access faults by messing around with the permissions,
> then remap the buffers and resume the transfers.
That sounds pretty conclusive. So when real hardware shows up it should
work reasonably well.
> > > +static struct arm_smmu_device *find_parent_smmu(struct arm_smmu_device *smmu)
> > > +{
> > > + struct arm_smmu_device *parent, *tmp;
> > > +
> > > + if (!smmu->parent_of_node)
> > > + return NULL;
> > > +
> > > + list_for_each_entry_safe(parent, tmp, &arm_smmu_devices, list)
> > > + if (parent->dev->of_node == smmu->parent_of_node)
> > > + return parent;
> >
> > Why do you need the _safe variant here? You are not changing the list in
> > this loop so you should be fine with list_for_each_entry().
>
> For a system with multiple SMMUs (regardless of chaining), couldn't this
> code run in parallel with probing of another SMMU (which has to add to the
> arm_smmu_devices list)? The same applies for device removal, which could
> perhaps be driven from some power-managment code.
Well, the '_safe' does not mean it is safe from concurrent list
manipulations. If you want to protect from that you still need a lock.
The '_safe' variant only allows to remove the current element from the
list while traversing it.
> > > + do {
> > > + phys_addr_t output_mask = (1ULL << smmu->s2_output_size) - 1;
> > > + if ((phys_addr_t)iova & ~output_mask)
> > > + return -ERANGE;
> > > + } while ((smmu = find_parent_smmu(smmu)));
> >
> > This looks a bit too expensive to have in the map path. How about saving
> > something like an effective_output_mask (or output_size) which contains
> > the logical OR of every mask up the path? This would make this check a
> > lot cheaper.
>
> As mentioned in the DT binding thread, it's rare that this loop would
> execute more than once, and largely inconceivable that it would execute more
> than twice, so I don't know how much we need to worry about the cost.
But still, this code is a challenge for the branch-predictor, plus
the additional function calls to find_parent_smmu(). I still think it is
worth to optimize this away. The map function is supposed to be a
fast-path function.
Joerg
More information about the devicetree-discuss
mailing list