[Cbe-oss-dev] [RFC/PATCH] libspe2: use mapped dma registers to speed up proxy dma
Kazunori Asayama
asayama at sm.sony.co.jp
Wed May 23 18:57:51 EST 2007
stenzel at de.ibm.com wrote:
> The following patch speeds up the MFC proxy command functions by
> writing to the dma registers directly via mapped problem state
> if the context was created with the SPE_MAP_PS flag.
>
> Comments are appreciated
Here are my comments:
> ===================================================================
> Index: libspe2/spebase/dma.c
> ===================================================================
> --- libspe2/spebase/dma.c (revision 39)
> +++ libspe2/spebase/dma.c (working copy)
(snip)
> -static int spe_read_tag_status_block(spe_context_ptr_t spectx, unsigned int *tag_status)
> +static int spe_read_tag_status_block(spe_context_ptr_t spectx, unsigned int mask, unsigned int *tag_status)
> {
> int fd;
> + volatile struct spe_mfc_command_area *cmd_area =
> + spectx->base_private->mfc_mmap_base;
>
> if (spectx->base_private->flags & SPE_MAP_PS) {
> - // fixme
> - errno = ENOTSUP;
> + _base_spe_context_lock(spectx, FD_MFC);
> + cmd_area->Prxy_QueryMask = mask;
> + __asm__ ("eieio");
> + while (*tag_status ^ mask)
> + *tag_status = cmd_area->Prxy_TagStatus;
'*tag_status' is read before it has been assigned.
It must be initialized first:
--
*tag_status = 0;
while (*tag_status ^ mask)
*tag_status = cmd_area->Prxy_TagStatus;
--
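An alternative that avoids depending on the caller's buffer at all is a do-while loop, which reads the register once before testing it. The following is only a minimal sketch: 'poll_tag_status' and its pointer parameter are hypothetical stand-ins for the mapped Prxy_TagStatus access, and the exit condition is kept the same as in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: 'tag_status_reg' stands in for the address of the
 * mapped Prxy_TagStatus register; it is not part of libspe2.
 */
static unsigned int poll_tag_status(volatile const unsigned int *tag_status_reg,
                                    unsigned int mask)
{
    unsigned int status;

    assert(tag_status_reg != NULL);

    /*
     * Read the register at least once before testing the value, so the
     * loop never looks at an uninitialized or stale status word.
     */
    do {
        status = *tag_status_reg;
    } while (status ^ mask);    /* same exit condition as in the patch */

    return status;
}
```

The caller would then store the return value into '*tag_status' after the loop has finished.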
(snip)
> @@ -266,8 +311,8 @@ static int spe_read_tag_status_noblock(s
> int _base_spe_mfcio_tag_status_read(spe_context_ptr_t spectx, unsigned int mask, unsigned int behavior, unsigned int *tag_status)
> {
> if ( mask != 0 ) {
> - errno = ENOTSUP;
> - return -1;
> + if (!(spectx->base_private->flags & SPE_MAP_PS))
> + mask = 0;
> }
A 'mask' value of zero (which has the special meaning 'all outstanding
tags') is not handled properly when SPE_MAP_PS is enabled. I think we
should either support that case or add an implementation note about
this restriction to the libspe2 spec.
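One possible way to support the zero case on the direct-access path is to translate it into a query for every tag group before writing Prxy_QueryMask. This is only a sketch under the assumption that tag groups with no outstanding commands report as complete in the tag status; 'effective_query_mask' and 'EX_ALL_TAG_GROUPS' are hypothetical names, not part of libspe2.

```c
#include <assert.h>

/* Hypothetical constant: one bit for each of the 32 MFC tag groups. */
#define EX_ALL_TAG_GROUPS 0xFFFFFFFFu

/*
 * Hypothetical helper: map the caller's mask to the value that would be
 * written to Prxy_QueryMask. Zero is expanded to 'all tag groups'; any
 * other mask is passed through unchanged.
 */
static unsigned int effective_query_mask(unsigned int mask)
{
    return mask != 0 ? mask : EX_ALL_TAG_GROUPS;
}

/* Small demonstration of the intended behavior. */
static void effective_query_mask_demo(void)
{
    assert(effective_query_mask(0) == EX_ALL_TAG_GROUPS);
    assert(effective_query_mask(0x3u) == 0x3u);
}
```

The polling loop would then have to compare the status against the translated mask rather than against the original zero.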
BTW, in this implementation, DMA proxy commands are always issued via
direct access when SPE_MAP_PS is enabled. However, if DMA proxy
commands are issued via direct access, we can't wait for their
completion via syscalls (poll/epoll). That means we can no longer wait
for DMA completion using the libspe2 event API when SPE_MAP_PS is
enabled. E.g., this restriction makes it impossible to write
applications that use the event API to wait for PPE-initiated DMAs and
also do SPE-SPE communication via SNR. So I think we may have to
introduce a new flag, separate from SPE_MAP_PS, to enable this
optimized behavior, so that each application can choose the behavior
it prefers.
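To illustrate the idea of a separate opt-in flag, here is a rough sketch of how the path selection could look. Everything in it is an assumption for illustration: 'EX_DIRECT_PROXY_DMA' is a hypothetical flag name, and the bit values are placeholders, not the real libspe2 definitions.

```c
#include <assert.h>

/* Placeholder bit values for this sketch only; not the libspe2 values. */
#define EX_MAP_PS           0x1u  /* stands in for SPE_MAP_PS */
#define EX_DIRECT_PROXY_DMA 0x2u  /* hypothetical new opt-in flag */

enum ex_dma_path {
    EX_DMA_VIA_SYSCALL,  /* completion can be waited for via poll/epoll */
    EX_DMA_VIA_MMIO      /* faster, but completion must be polled directly */
};

/*
 * Use direct register access only when the problem state is mapped AND
 * the application has explicitly asked for the optimized behavior.
 */
static enum ex_dma_path choose_dma_path(unsigned int flags)
{
    const unsigned int required = EX_MAP_PS | EX_DIRECT_PROXY_DMA;

    return (flags & required) == required ? EX_DMA_VIA_MMIO
                                          : EX_DMA_VIA_SYSCALL;
}

/* Small demonstration of the intended behavior. */
static void choose_dma_path_demo(void)
{
    assert(choose_dma_path(EX_MAP_PS) == EX_DMA_VIA_SYSCALL);
    assert(choose_dma_path(EX_MAP_PS | EX_DIRECT_PROXY_DMA) == EX_DMA_VIA_MMIO);
}
```

With something like this, an application that maps the problem state for SNR access but still relies on the event API would simply not set the new flag.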
--
(ASAYAMA Kazunori
(asayama at sm.sony.co.jp))