[Cbe-oss-dev] [RFC/PATCH] libspe2: use mapped dma registers to speed up proxy dma
Gerhard Stenzel
gerhard.stenzel at de.ibm.com
Sat Jun 2 01:23:13 EST 2007
Kazunori Asayama <asayama at sm.sony.co.jp> wrote on 05/24/2007 11:55:13 AM:
> "D. Herrendoerfer" <d.herrendoerfer at herrendoerfer.name> wrote:
> > On Wed, 2007-05-23 at 17:57 +0900, Kazunori Asayama wrote:
> > [snip]
> > >
> > > BTW, in this implementation, DMA proxy commands are always issued via
> > > direct access when SPE_MAP_PS is enabled. However, if DMA proxy
> > > commands are issued via direct access, we can't wait for completion of
> > > the DMAs via syscalls (poll/epoll). That means we can no longer wait
> > > for DMA completion by using libspe2 event API when SPE_MAP_PS is
> > > enabled. E.g., such a restriction makes it impossible to create
> > > applications which use event API to wait for PPE-initiated DMAs and do
> > > SPE-SPE communication via SNR. So I think we may have to introduce a
> > > new separate flag from SPE_MAP_PS to enable this optimized behavior,
> > > so that each application can choose preferable behavior.
> > >
> > > --
> > > (ASAYAMA Kazunori
> > > (asayama at sm.sony.co.jp))
> >
> > Indeed, we discussed this briefly yesterday - but for another reason:
> > In HPC uses it might be preferable to turn off synchronization (locking)
> > in the libspe code, and have the application take care of this manually.
> > Since this approach brings a DMA throughput gain of over 100% it might
> > make sense to also add this option.
Apologies for not responding earlier as I was busy with other stuff.
>
> Yes, I have no objection to this optimization. My concern is how to
> enable the optimization by applications. Now, we have three options:
>
> - always if SPE_MAP_PS is set
> (the current implementation)
>
> SPE_MAP_PS and SPE_EVENT_TAG_GROUP cannot be used at the same
> time; however, applications can use this optimized DMA and
> all other events at the same time.
>
> - if SPE_MAP_PS is set and SPE_EVENTS_ENABLE is not set
> (Gerhard's suggestion)
>
> SPE_MAP_PS and SPE_EVENT_TAG_GROUP can be used at the same
> time, but applications cannot use the optimized DMA even if
> they don't use SPE_EVENT_TAG_GROUP.
>
> - introduce another new 'explicit' flag to enable the optimized DMA,
> such as SPE_PS_DMA
>
> It seems flexible, but complicated.
>
> (and other alternatives ?)
>
> Honestly, I'm not sure which is the best one and have no strong
> opinion, because I don't have enough 'real' use cases of DMA proxy.
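To make the three options above easier to compare, here is a minimal sketch of the decision each one implies for a given set of context creation flags. It is not libspe2 code: the enum, the function name use_direct_dma() and the numeric flag values are made up for illustration, and SPE_PS_DMA is only the name proposed above, not an existing flag. For the third option I assume the new flag is only meaningful together with SPE_MAP_PS.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration only; the real ones are in libspe2.h. */
#define SPE_MAP_PS        0x0040u
#define SPE_EVENTS_ENABLE 0x1000u
#define SPE_PS_DMA        0x2000u  /* hypothetical flag from the third option */

enum dma_policy {
    POLICY_ALWAYS_IF_MAP_PS,  /* option 1: the current implementation */
    POLICY_MAP_PS_NO_EVENTS,  /* option 2: only when events are not enabled */
    POLICY_EXPLICIT_FLAG      /* option 3: separate opt-in flag */
};

/* Returns true if proxy DMA would be issued through the mapped registers. */
static bool use_direct_dma(enum dma_policy policy, unsigned int flags)
{
    bool mapped = (flags & SPE_MAP_PS) != 0;

    switch (policy) {
    case POLICY_ALWAYS_IF_MAP_PS:
        return mapped;
    case POLICY_MAP_PS_NO_EVENTS:
        return mapped && (flags & SPE_EVENTS_ENABLE) == 0;
    case POLICY_EXPLICIT_FLAG:
        /* Assumption: the explicit flag still needs the mapped problem state. */
        return mapped && (flags & SPE_PS_DMA) != 0;
    default:
        assert(!"unknown dma_policy");
        return false;
    }
}
```

With SPE_MAP_PS | SPE_EVENTS_ENABLE, the first policy selects direct DMA and the second does not; the third selects it only if SPE_PS_DMA is also set.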
How about the following:
If a context is created with both SPE_MAP_PS and SPE_EVENTS_ENABLE, the
event handler will not be allowed to register for the SPE_EVENT_TAG_GROUP
event?
We should stay compatible with previous versions (the MFC functions were not
supported with SPE_MAP_PS so far), and if an application is modified to use
SPE_MAP_PS, it will also have to handle DMA completion itself.
This patch illustrates it:
diff -u -r1.2 spe_event.c
--- speevent/spe_event.c 16 Apr 2007 13:24:54 -0000 1.2
+++ speevent/spe_event.c 1 Jun 2007 15:14:36 -0000
@@ -221,6 +221,12 @@
       _event_spe_context_unlock(event->spe);
       return -1;
     }
+
+    if (event->spe->base_private->flags & SPE_MAP_PS) {
+      _event_spe_context_unlock(event->spe);
+      errno = ENOTSUP;
+      return -1;
+    }
     ev_buf = &evctx->events[__SPE_EVENT_TAG_GROUP];
     ev_buf->events = SPE_EVENT_TAG_GROUP;
If that sounds like an acceptable compromise, I will send an updated patch
next week.
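From the application side, the effect of this check could look roughly like the following. This is a self-contained sketch, not libspe2 code: struct mock_spe_context, mock_register_event() and choose_wait_mode() are hypothetical names, and the flag values are placeholders. It only shows the intended contract: registering for SPE_EVENT_TAG_GROUP fails with ENOTSUP on a context created with SPE_MAP_PS, and the application then has to wait for DMA completion by its own means.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder bit values for illustration only; the real ones are in libspe2.h. */
#define SPE_MAP_PS          0x0040u
#define SPE_EVENT_TAG_GROUP 0x0020u

/* Hypothetical stand-in for an SPE context; it only carries the creation flags. */
struct mock_spe_context {
    unsigned int flags;
};

/* Mimics the check added by the patch: tag group events are refused
 * when the context was created with SPE_MAP_PS. */
static int mock_register_event(const struct mock_spe_context *spe,
                               unsigned int events)
{
    assert(spe != NULL);
    if ((events & SPE_EVENT_TAG_GROUP) && (spe->flags & SPE_MAP_PS)) {
        errno = ENOTSUP;
        return -1;
    }
    return 0;
}

enum wait_mode { WAIT_BY_EVENT, WAIT_BY_POLLING, WAIT_ERROR };

/* Application-side choice: use the event API if registration succeeds,
 * otherwise fall back to handling DMA completion manually. */
static enum wait_mode choose_wait_mode(const struct mock_spe_context *spe)
{
    if (mock_register_event(spe, SPE_EVENT_TAG_GROUP) == 0)
        return WAIT_BY_EVENT;
    if (errno == ENOTSUP)
        return WAIT_BY_POLLING;
    return WAIT_ERROR;
}
```

An application that does not set SPE_MAP_PS keeps the existing event-based behavior unchanged, which is the compatibility point made above.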
>
> --
> (ASAYAMA Kazunori
> (asayama at sm.sony.co.jp))
Best regards,
Gerhard Stenzel, Linux on Cell Development, LTC
-----------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Entwicklung GmbH
Chairman of the Supervisory Board: Martin Jetter | Management: Herbert Kircher
Registered office: Böblingen | Court of registration: Amtsgericht Stuttgart, HRB 243294