[Cbe-oss-dev] [RFC/PATCH] libspe2: use mapped dma registers to speed up proxy dma

Kazunori Asayama asayama at sm.sony.co.jp
Wed May 23 18:57:51 EST 2007


stenzel at de.ibm.com wrote:
> The following patch speeds up the MFC proxy command functions by
> writing to the dma registers directly via mapped problem state
> if the context was created with the SPE_MAP_PS flag.
> 
> Comments are appreciated

Here are my comments:

> ===================================================================
> Index: libspe2/spebase/dma.c
> ===================================================================
> --- libspe2/spebase/dma.c	(revision 39)
> +++ libspe2/spebase/dma.c	(working copy)
(snip)
> -static int spe_read_tag_status_block(spe_context_ptr_t spectx, unsigned int *tag_status)
> +static int spe_read_tag_status_block(spe_context_ptr_t spectx, unsigned int mask, unsigned int *tag_status)
>  {
>  	int fd;
> +	volatile struct spe_mfc_command_area *cmd_area = 
> +                spectx->base_private->mfc_mmap_base;
>  
>  	if (spectx->base_private->flags & SPE_MAP_PS) {
> -		// fixme
> -		errno = ENOTSUP;
> +		_base_spe_context_lock(spectx, FD_MFC);
> +		cmd_area->Prxy_QueryMask = mask;
> +		__asm__ ("eieio");
> +		while  (*tag_status ^ mask) 
> +			*tag_status =  cmd_area->Prxy_TagStatus;

'*tag_status' is read before it has been assigned.
It must be initialized first:
--
		*tag_status = 0;
		while  (*tag_status ^ mask) 
			*tag_status =  cmd_area->Prxy_TagStatus;
--
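
For illustration only, here is a minimal self-contained sketch of the
corrected wait loop. The struct is a stand-in for the mapped command
area: it has only the two fields used here and does not reproduce the
real layout of spe_mfc_command_area. The name ex_wait_tags is made up
for this example, and context locking is left out.

```c
#include <stdint.h>

/* Stand-in for the mapped MFC command area: only the two fields used
 * below, NOT the real register layout. */
struct ex_cmd_area {
	uint32_t Prxy_QueryMask;
	uint32_t Prxy_TagStatus;
};

/* Hypothetical helper: spin until every tag group selected by 'mask'
 * is reported complete. Assumes the status register only ever reports
 * bits that are set in the query mask. */
static int ex_wait_tags(volatile struct ex_cmd_area *cmd_area,
			unsigned int mask, unsigned int *tag_status)
{
	cmd_area->Prxy_QueryMask = mask;
#if defined(__powerpc__) || defined(__powerpc64__)
	/* Order the mask write before the status reads. */
	__asm__ __volatile__ ("eieio" ::: "memory");
#endif
	*tag_status = 0;	/* defined value before the first test */
	while (*tag_status ^ mask)
		*tag_status = cmd_area->Prxy_TagStatus;
	return 0;
}
```

Whatever the caller left in '*tag_status' no longer influences whether
the loop is entered.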

(snip)
> @@ -266,8 +311,8 @@ static int spe_read_tag_status_noblock(s
>  int _base_spe_mfcio_tag_status_read(spe_context_ptr_t spectx, unsigned int mask, unsigned int behavior, unsigned int *tag_status)
>  {
>  	if ( mask != 0 ) {
> -		errno = ENOTSUP;
> -		return -1;
> +		if (!(spectx->base_private->flags & SPE_MAP_PS)) 
> +			mask = 0;
>  	}

A 'mask' of zero has the special meaning 'all outstanding tags', but
it is not handled that way when SPE_MAP_PS is enabled. I think that we
should either support that case or add an implementation note about
this restriction to the libspe2 spec.
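
One possible way to support the zero mask on the direct path is
sketched below. It assumes the library records which tags it has issued
proxy commands for and substitutes that set when the caller passes 0.
The struct and function names are hypothetical, not existing libspe2
code, and locking is omitted.

```c
/* Hypothetical bookkeeping, not a libspe2 structure. */
struct ex_tag_tracker {
	/* One bit per tag that has an issued, not yet confirmed command. */
	unsigned int active_tagmask;
};

/* Record that a proxy command was issued with 'tag' (assumed 0..31). */
static void ex_note_issue(struct ex_tag_tracker *t, unsigned int tag)
{
	t->active_tagmask |= 1u << tag;
}

/* Resolve the caller's mask: 0 means 'all outstanding tags'. */
static unsigned int ex_effective_mask(const struct ex_tag_tracker *t,
				      unsigned int mask)
{
	return mask ? mask : t->active_tagmask;
}

/* Drop the tags that a status read reported as complete. */
static void ex_note_complete(struct ex_tag_tracker *t,
			     unsigned int tag_status)
{
	t->active_tagmask &= ~tag_status;
}
```

The resolved mask would then be written to Prxy_QueryMask in place of
the caller's value.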


BTW, in this implementation, DMA proxy commands are always issued via
direct access when SPE_MAP_PS is enabled. However, if DMA proxy
commands are issued via direct access, we can't wait for their
completion via syscalls (poll/epoll). That means we can no longer wait
for DMA completion with the libspe2 event API when SPE_MAP_PS is
enabled. E.g., this restriction makes it impossible to write
applications that both use the event API to wait for PPE-initiated
DMAs and do SPE-SPE communication via SNR (which needs SPE_MAP_PS).
So I think we may have to introduce a new flag, separate from
SPE_MAP_PS, to enable this optimized behavior, so that each
application can choose the behavior it prefers.
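
A rough sketch of what such a separate opt-in could look like follows.
The flag name EX_DIRECT_PROXY and both numeric values are placeholders
invented for this example; they are not libspe2 definitions.

```c
/* Placeholder flag bits for illustration; NOT the values from libspe2. */
#define EX_MAP_PS	0x1u	/* stands in for SPE_MAP_PS */
#define EX_DIRECT_PROXY	0x2u	/* hypothetical new opt-in flag */

enum ex_dma_path {
	EX_PATH_FILE,	/* file-based path: poll/epoll and events still work */
	EX_PATH_DIRECT	/* mapped-register path: faster, no event support */
};

/* Use the direct path only when the problem state is mapped AND the
 * application explicitly asked for the optimization. */
static enum ex_dma_path ex_choose_path(unsigned int flags)
{
	if ((flags & EX_MAP_PS) && (flags & EX_DIRECT_PROXY))
		return EX_PATH_DIRECT;
	return EX_PATH_FILE;
}
```

With this split, an application that maps the problem state only for
SNR access keeps the event API for its PPE-initiated DMAs.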

--
(ASAYAMA Kazunori
  (asayama at sm.sony.co.jp))