[Cbe-oss-dev] [RFC/PATCH] libspe2: use mapped dma registers to speed up proxy dma

Kazunori Asayama asayama at sm.sony.co.jp
Thu May 24 19:55:13 EST 2007


"D. Herrendoerfer" <d.herrendoerfer at herrendoerfer.name> wrote:
> On Wed, 2007-05-23 at 17:57 +0900, Kazunori Asayama wrote:
> [snip]
> > 
> > BTW, in this implementation, DMA proxy commands are always issued via
> > direct access when SPE_MAP_PS is enabled. However, if DMA proxy
> > commands are issued via direct access, we can't wait for completion of
> > the DMAs via syscalls (poll/epoll). That means we can no longer wait
> > for DMA completion by using libspe2 event API when SPE_MAP_PS is
> > enabled. E.g., such a restriction makes it impossible to create
> > applications which use event API to wait for PPE-initiated DMAs and do
> > SPE-SPE communication via SNR. So I think we may have to introduce a
> > new separate flag from SPE_MAP_PS to enable this optimized behavior,
> > so that each application can choose preferable behavior.
> > 
> > --
> > (ASAYAMA Kazunori
> >   (asayama at sm.sony.co.jp))
> 
> Indeed, we discussed this briefly yesterday - but for another reason:
> In HPC uses it might be preferable to turn off synchronization (locking)
> in the libspe code, and have the application take care of this manually.
> Since this approach brings a DMA throughput gain of over 100% it might
> make sense to also add this option.

Yes, I have no objection to this optimization. My concern is how
applications should enable it. So far, we have three options:

  - always if SPE_MAP_PS is set
    (the current implementation)

    SPE_MAP_PS and SPE_EVENT_TAG_GROUP cannot be used at the same
    time. However, applications can use the optimized DMA together
    with all other events.

  - if SPE_MAP_PS is set and SPE_EVENTS_ENABLE is not set
    (Gerhard's suggestion)

    SPE_MAP_PS and SPE_EVENT_TAG_GROUP can be used at the same time,
    but applications that set SPE_EVENTS_ENABLE cannot use the
    optimized DMA even if they don't use SPE_EVENT_TAG_GROUP.

  - introduce another new 'explicit' flag to enable the optimized DMA,
    such as SPE_PS_DMA

    It seems flexible, but complicated.

(and other alternatives ?)
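
To make the trade-offs concrete, here is a rough sketch of the three
options as a predicate over the context creation flags. It is only an
illustration, not code from libspe2: the DEMO_* bit values are
placeholders rather than the real values in libspe2.h, DEMO_PS_DMA
stands for the not-yet-existing SPE_PS_DMA flag, and the function and
enum names are made up. I also assume that the explicit flag would
still require SPE_MAP_PS, since direct access needs the mapped
problem state area.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration only; NOT taken from libspe2.h. */
enum {
    DEMO_MAP_PS        = 1u << 0, /* stands in for SPE_MAP_PS */
    DEMO_EVENTS_ENABLE = 1u << 1, /* stands in for SPE_EVENTS_ENABLE */
    DEMO_PS_DMA        = 1u << 2  /* hypothetical new flag (SPE_PS_DMA) */
};

/* The three options discussed above (names are hypothetical). */
enum dma_policy {
    POLICY_ALWAYS_IF_MAP_PS, /* option 1: the current implementation */
    POLICY_UNLESS_EVENTS,    /* option 2: Gerhard's suggestion */
    POLICY_EXPLICIT_FLAG     /* option 3: separate explicit flag */
};

/* Returns true if proxy DMA commands would be issued via direct access. */
static bool use_direct_dma(enum dma_policy policy, unsigned int flags)
{
    bool map_ps = (flags & DEMO_MAP_PS) != 0;

    switch (policy) {
    case POLICY_ALWAYS_IF_MAP_PS:
        return map_ps;
    case POLICY_UNLESS_EVENTS:
        return map_ps && (flags & DEMO_EVENTS_ENABLE) == 0;
    case POLICY_EXPLICIT_FLAG:
        /* Assumption: the explicit flag only has an effect with MAP_PS. */
        return map_ps && (flags & DEMO_PS_DMA) != 0;
    }
    assert(!"unknown policy");
    return false;
}

/*
 * Tag group completion can be waited for through the event API only
 * when events are enabled and DMA is not issued via direct access.
 */
static bool can_wait_tag_group_event(enum dma_policy policy, unsigned int flags)
{
    return (flags & DEMO_EVENTS_ENABLE) != 0 && !use_direct_dma(policy, flags);
}
```

For a context created with both the MAP_PS and EVENTS_ENABLE bits,
option 1 gives direct DMA but no tag group events, option 2 gives tag
group events but no direct DMA, and option 3 behaves like option 2
unless the application also sets the explicit flag.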

Honestly, I'm not sure which option is best and have no strong
opinion, because I don't have enough 'real' use cases for proxy DMA.

--
(ASAYAMA Kazunori
  (asayama at sm.sony.co.jp))