[Cbe-oss-dev] spufs: kernel hangs by polling 'mfc' w/o proxy DMA request
Arnd Bergmann
arnd at arndb.de
Wed Sep 12 23:00:44 EST 2007
On Friday 07 September 2007, Kazunori Asayama wrote:
> I found that the kernel hangs if programs poll 'mfc' node of SPUFS
> without proxy DMA requests. The reason why this problem occurs is
> that:
>
> - If spufs_mfc_poll, which is the 'poll' operator of 'mfc', is
> called without proxy DMA requests, spufs_mfc_poll issues a proxy
> tag group query with query mask = 0 and query type = 2 (all):
>
> ctx->ops->set_mfc_query(ctx, ctx->tagwait, 2);
>
> The processor immediately raises a 'tag-group completion
> interrupt' corresponding to this query, because there is no
> outstanding proxy DMA at all.
>
> - The spufs_mfc_poll never (regardless of other conditions) returns
> POLLIN event when tagwait (a set of issued proxy DMA) is zero:
>
>     if (tagstatus & ctx->tagwait)
>         mask |= POLLIN | POLLRDNORM;
>
> - As a result of the above, spufs_mfc_poll endlessly issues proxy
> tag group queries with query mask = 0 and query type = 2 once it
> has been called without a proxy DMA request.
Oh, you mean it gets into a busy-loop? That should really not
happen.
I suppose we should immediately return from the loop when a program
accidentally calls poll() without having entered any requests into
the queue first, like
	if (!ctx->tagwait)
		mask |= POLLERR;
and have the read function on the file return -EINVAL.
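To make the intended poll result concrete, here is a minimal user-space model of the mask computation. It is only a sketch: struct model_ctx and model_mfc_poll are hypothetical names, not the spufs code, and the real function would also have to skip the set_mfc_query call in the tagwait == 0 case.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two context fields that matter here. */
struct model_ctx {
	uint32_t tagwait;   /* tag groups with proxy DMA still outstanding */
	uint32_t tagstatus; /* tag groups currently reported as complete */
};

/* Returns the event mask that poll() would report for the model context. */
static unsigned int model_mfc_poll(const struct model_ctx *ctx)
{
	unsigned int mask = 0;

	assert(ctx != NULL);

	/*
	 * Nothing was queued: report an error right away instead of issuing
	 * a query that completes immediately and can never yield POLLIN.
	 */
	if (!ctx->tagwait) {
		mask |= POLLERR;
		return mask;
	}

	/* At least one of the tag groups we are waiting for has completed. */
	if (ctx->tagstatus & ctx->tagwait)
		mask |= POLLIN | POLLRDNORM;

	return mask;
}
```

With tagwait == 0 the model reports POLLERR only, with outstanding but incomplete DMA it reports nothing, and with a completed tag group it reports POLLIN | POLLRDNORM.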
> My questions are:
>
> - Why does spufs_mfc_poll issue queries with query type = 2 ? It
> seems strange that this condition (all) is different from
> spufs_mfc_read's one (any).
>
> I guess that this is just a workaround to implement the
> 'SPE_TAG_ALL' behavior of spe_mfcio_tag_status_read without using
> incomplete fsync implementation.
Don't remember why it was done, but it seems strange now. User space
could still call poll()/read() repeatedly until all DMAs are
done to implement SPE_TAG_ALL from user space, even if we use
query type 1.
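The user-space loop I have in mind could look roughly like this. It is a sketch under assumptions: wait_any_fn stands in for a blocking poll()/read() on 'mfc' with 'any' semantics, and complete_lowest_tag is a made-up simulation that finishes one tag group per call, so none of these names exist in libspe or spufs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical callback: blocks until at least one tag group in `pending`
 * has completed and returns the completed bits, or 0 on error.
 */
typedef uint32_t (*wait_any_fn)(uint32_t pending, void *arg);

/* Emulates "wait for all" on top of a primitive that only waits for "any". */
static int wait_all_tags(uint32_t mask, wait_any_fn wait_any, void *arg)
{
	uint32_t pending = mask;

	assert(wait_any != NULL);

	while (pending) {
		uint32_t done = wait_any(pending, arg) & pending;

		if (!done)
			return -1;	/* error, or no progress was made */
		pending &= ~done;	/* keep waiting only for the rest */
	}
	return 0;
}

/* Simulation only: completes the lowest pending tag group on every call. */
static uint32_t complete_lowest_tag(uint32_t pending, void *arg)
{
	int *calls = arg;

	(*calls)++;
	return pending & (~pending + 1u);
}
```

The loop costs one wakeup per completed group in the worst case, which is the price of not having a working 'all' query in the kernel.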
> I remember that when we discussed how the behavior flags of
> spe_mfcio_tag_status_read should be implemented, we reached the
> conclusion as:
>
> * behaviors of operations on 'mfc' node:
>
> - blocking read on 'mfc'
>
> blocks until at least one of the DMAs completes and then
> returns all currently complete tag groups.
>
> - non-blocking read on 'mfc'
>
> reads the current Prxy_TagStatus and masks it with tagwait.
>
> - fsync on 'mfc'
>
> blocks until all DMAs (tagwait) are complete.
Actually, this was disabled at some point and never put back:
#if 0
	/* this currently hangs */
	ret = spufs_wait(ctx->mfc_wq,
			 ctx->ops->set_mfc_query(ctx, ctx->tagwait, 2));
	if (ret)
		goto out;
	ret = spufs_wait(ctx->mfc_wq,
			 ctx->ops->read_mfc_tagstatus(ctx) == ctx->tagwait);
out:
#else
	ret = 0;
#endif
I have no idea why it would hang, though.
> - poll on 'mfc'
>
> returns whenever any one of the DMAs completes.
>
> * implementations of libspe
>
> - spe_mfcio_tag_status_read with SPE_TAG_ALL: fsync then read.
>
> - spe_mfcio_tag_status_read with SPE_TAG_ANY: blocking read.
>
> - spe_mfcio_tag_status_read with SPE_TAG_IMMEDIATE: non-blocking read.
>
> - events: poll
yes, sounds right.
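For reference, the mapping above can be sketched on a plain file descriptor as follows. This is not the libspe2 source: tag_status_read and enum tag_behavior are hypothetical names, and the sketch assumes that the status is read from 'mfc' as a single 32-bit word and that toggling O_NONBLOCK with fcntl() is an acceptable way to get the non-blocking read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical counterparts of SPE_TAG_ALL, SPE_TAG_ANY, SPE_TAG_IMMEDIATE. */
enum tag_behavior { TAG_ALL, TAG_ANY, TAG_IMMEDIATE };

/* Reads the tag status from fd; returns 0 on success, -1 on failure. */
static int tag_status_read(int fd, enum tag_behavior behavior,
			   uint32_t *status)
{
	int flags, wanted, ret;
	ssize_t n;

	assert(status != NULL);

	flags = fcntl(fd, F_GETFL);
	if (flags < 0)
		return -1;

	/* "all": fsync first, so every outstanding DMA is complete */
	if (behavior == TAG_ALL && fsync(fd) < 0)
		return -1;

	/* "immediate": non-blocking read; "all" and "any": blocking read */
	wanted = (behavior == TAG_IMMEDIATE) ? (flags | O_NONBLOCK)
					     : (flags & ~O_NONBLOCK);
	if (fcntl(fd, F_SETFL, wanted) < 0)
		return -1;

	n = read(fd, status, sizeof(*status));
	ret = (n == (ssize_t)sizeof(*status)) ? 0 : -1;

	/* put the caller's file status flags back */
	if (fcntl(fd, F_SETFL, flags) < 0)
		return -1;

	return ret;
}
```

Events would then use poll() on the same descriptor rather than any of the three paths above.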
> - If my understanding is correct, is it OK if we fix the fsync
> implementation and then change the poll behavior so that it waits
> for the 'any' condition?
>
> I think that there is no problem with doing so, because the
> libspe2's spe_mfcio_tag_status_read already has code to call fsync
> on 'mfc' for SPE_TAG_ALL and the libspe2 will still work correctly
> after this change.
yes.
> - How should the SPUFS behave with tagwait = 0?
>
> I think that the following are reasonable and useful for application
> programs:
>
> - poll: waits until new proxy DMAs are issued and any of them
> completes.
As mentioned above, I think returning POLLERR would be more appropriate.
> - fsync: returns immediately.
yes, that would be good
Arnd <><