[PATCH v2 3/4] fsl-dma: change the release process of dma descriptor
Ira W. Snyder
iws at ovro.caltech.edu
Thu Jul 12 02:30:58 EST 2012
On Wed, Jul 11, 2012 at 05:01:25PM +0800, Qiang Liu wrote:
> Modify the release process of dma descriptor for avoiding exception when
> enable config NET_DMA, release dma descriptor from 1st to last second, the
> last descriptor which is reserved in current descriptor register may not be
> completed, race condition will be raised if free current descriptor.
>
> A race condition which is raised when use both of talitos and dmaengine to
> offload xor is because napi scheduler (NET_DMA is enabled) will sync all
> pending requests in dma channels, it affects the process of raid operations.
> The descriptor is freed which is submitted just now, but async_tx must check
> whether this depend tx descriptor is acked, there are poison contents in the
> invalid address, then BUG_ON() is thrown, so this descriptor will be freed
> in the next time.
>
This patch seems to be covering up a bug in the driver, rather than
actually fixing it.
When it was written, it was expected that dma_do_tasklet() would run
only when the controller was idle.
> Cc: Dan Williams <dan.j.williams at intel.com>
> Cc: Vinod Koul <vinod.koul at intel.com>
> Cc: Li Yang <leoli at freescale.com>
> Signed-off-by: Qiang Liu <qiang.liu at freescale.com>
> ---
> drivers/dma/fsldma.c | 15 ++++++++++++---
> 1 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
> index 4f2f212..0ba3e40 100644
> --- a/drivers/dma/fsldma.c
> +++ b/drivers/dma/fsldma.c
> @@ -1035,14 +1035,22 @@ static irqreturn_t fsldma_chan_irq(int irq, void *data)
> static void dma_do_tasklet(unsigned long data)
> {
> struct fsldma_chan *chan = (struct fsldma_chan *)data;
> - struct fsl_desc_sw *desc, *_desc;
> + struct fsl_desc_sw *desc, *_desc, *prev = NULL;
> LIST_HEAD(ld_cleanup);
> unsigned long flags;
> + dma_addr_t curr_phys = get_cdar(chan);
>
> chan_dbg(chan, "tasklet entry\n");
>
> spin_lock_irqsave(&chan->desc_lock, flags);
>
> + /* find the descriptor which is already completed */
> + list_for_each_entry_safe(desc, _desc, &chan->ld_running, node) {
> + if (prev && desc->async_tx.phys == curr_phys)
> + break;
> + prev = desc;
> + }
> +
If the DMA controller was still busy processing transactions, you should
have gotten the printout "irq: controller not idle!" from
fsldma_chan_irq() just before it scheduled the dma_do_tasklet() to run.
If you did not get this printout, how was dma_do_tasklet() entered with
the controller still busy? I don't understand how it can happen.
If you test without your spin_lock_bh() and spin_unlock_bh() conversion
patch, do you still hit the error?
What happens if a user submits exactly one DMA transaction, and then
leaves the system idle? The callback for the last descriptor in the
chain will never get run, right? That's a bug.
> /* update the cookie if we have some descriptors to cleanup */
> if (!list_empty(&chan->ld_running)) {
> dma_cookie_t cookie;
> @@ -1058,13 +1066,14 @@ static void dma_do_tasklet(unsigned long data)
> * move the descriptors to a temporary list so we can drop the lock
> * during the entire cleanup operation
> */
> - list_splice_tail_init(&chan->ld_running, &ld_cleanup);
> + list_cut_position(&ld_cleanup, &chan->ld_running, &prev->node);
>
> /* the hardware is now idle and ready for more */
> chan->idle = true;
>
> /*
> - * Start any pending transactions automatically
> + * Start any pending transactions automatically if current descriptor
> + * list is completed
> *
> * In the ideal case, we keep the DMA controller busy while we go
> * ahead and free the descriptors below.
> --
> 1.7.5.1
>
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev at lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
More information about the Linuxppc-dev
mailing list