[PATCH 5/5 v11] iommu/fsl: Freescale PAMU driver and iommu implementation.
Sethi Varun-B16395
B16395 at freescale.com
Wed Apr 3 16:12:16 EST 2013
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Wednesday, April 03, 2013 7:23 AM
> To: Timur Tabi
> Cc: Joerg Roedel; Sethi Varun-B16395; lkml; Kumar Gala; Yoder Stuart-
> B08248; iommu at lists.linux-foundation.org; Benjamin Herrenschmidt;
> linuxppc-dev at lists.ozlabs.org
> Subject: Re: [PATCH 5/5 v11] iommu/fsl: Freescale PAMU driver and iommu
> implementation.
>
> On 04/02/2013 08:35:54 PM, Timur Tabi wrote:
> > On Tue, Apr 2, 2013 at 11:18 AM, Joerg Roedel <joro at 8bytes.org> wrote:
> >
> > > > + panic("\n");
> > >
> > > A kernel panic seems like an over-reaction to an access violation.
> >
> > We have no way of determining what code caused the violation, so we
> > can't just kill the process. I agree it seems like overkill, but what
> > else should we do? Does the IOMMU layer have a way for the IOMMU
> > driver to stop the device that caused the problem?
>
> At a minimum, log a message and continue. Probably turn off the LIODN,
> at least if it continues to be noisy (otherwise we could get stuck in an
> interrupt storm as you note). Possibly let the user know somehow,
> especially if it's a VFIO domain.
[Sethi Varun-B16395] We can log the message and disable the LIODN (to
avoid an interrupt storm), but we definitely need a mechanism to inform
the VFIO subsystem about the error. Also, disabling the LIODN may not be
a viable option with the new LIODN allocation scheme (where an LIODN
would be associated with a domain).
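
To make the "log, continue, and shut off a noisy LIODN" idea concrete,
here is a rough userspace sketch in plain C. It is only an illustration
of the policy, not code from the PAMU driver: the table, the threshold,
and every name in it (liodn_table, STORM_THRESHOLD,
handle_access_violation) are hypothetical, and it assumes violations can
be tracked per LIODN, which may not hold with the domain-based
allocation scheme mentioned above.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_LIODN        16  /* hypothetical table size for this sketch */
#define STORM_THRESHOLD   3  /* hypothetical: violations allowed before disabling */

/* Per-LIODN bookkeeping; purely illustrative, not a PAMU structure. */
struct liodn_state {
	unsigned int violations;  /* access violations seen so far */
	bool enabled;             /* false once the LIODN has been shut off */
};

static struct liodn_state liodn_table[MAX_LIODN];

/* Mark every LIODN as enabled with a clean violation count. */
static void liodn_table_init(void)
{
	for (int i = 0; i < MAX_LIODN; i++) {
		liodn_table[i].violations = 0;
		liodn_table[i].enabled = true;
	}
}

/*
 * Handle one access violation: log it, count it, and disable the LIODN
 * once the count reaches STORM_THRESHOLD, instead of panicking.
 * Returns true if the LIODN is still enabled after this call.
 */
static bool handle_access_violation(unsigned int liodn, unsigned long addr)
{
	struct liodn_state *s;

	if (liodn >= MAX_LIODN)
		return false;  /* unknown LIODN: nothing to track */

	s = &liodn_table[liodn];
	if (!s->enabled)
		return false;  /* already shut off, stay quiet */

	s->violations++;
	fprintf(stderr, "access violation: liodn %u addr 0x%lx (count %u)\n",
		liodn, addr, s->violations);

	if (s->violations >= STORM_THRESHOLD) {
		s->enabled = false;
		fprintf(stderr, "liodn %u disabled after repeated violations\n",
			liodn);
	}
	return s->enabled;
}
```

With the threshold at 3, the first two violations on a given LIODN are
only logged and the third one disables it; other LIODNs are unaffected.
Notifying VFIO would still need a separate mechanism that this sketch
does not attempt.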
>
> Don't take down the whole kernel. It's not just overkill; it undermines
> VFIO's efforts to make it safe for users to control devices.
>
> > > Besides the device that caused the violation the system should still
> > > work, no?
> >
> > Not really. The PAMU was designed to add IOMMU support to legacy
> > devices, which have no concept of an MMU. If the PAMU detects an
> > access violation, there's no way for the device to recover, because it
> > has no idea that a violation has occurred. It's going to keep on
> > writing to bad data.
>
> I think that's only the case for posted writes (or devices which fail to
> take a hint and stop even after they see an I/O error).
>
[Sethi Varun-B16395] Even in the case where the guest driver detects a failure, it may not be able to fix the problem without intervention from the VFIO subsystem.
-Varun