floating point support in the driver.

Misbah khan misbah_khan at engineer.com
Tue Aug 5 19:49:17 EST 2008


Hi David,

Thank you for your reply.

I am running the algorithm on an OMAP processor (ARM core), and I also tried
it on an iMX processor, where it takes 1.7 times longer than on the OMAP.

It is true that the algorithm performs a vector operation that is blowing
out the cache.

But the question is: how do we lock the cache? How should we implement
this in a driver?

Example code or a document would be helpful in this regard.

--- Misbah <><

David Hawkins-3 wrote:
> 
> 
> Hi Misbah,
> 
> I would recommend you look at your floating-point code again
> and benchmark each section. You should be able to estimate
> the number of clock cycles required to complete an operation
> and then check that against your measurements.
> 
> Depending on whether your algorithm is processing intensive
> or data movement intensive, you may find that the big time
> waster is moving data on or off chip, or perhaps it's a large
> vector operation that is blowing out the cache. If you
> do find that, then on some processors you can lock the
> cache, so your algorithm would require a custom driver
> that steals part of the cache from the OS, but the floating point
> code would not run in the kernel, it would run on data
> stored in the stolen cache area. You can lock both instructions
> and data in the cache; e.g., an FFT routine can be locked in
> the instruction cache, while FFT data is in the data cache.
> I'm not sure how easy this is to do under Linux though.
> 
> Here's an example of the level of detail you can get
> down to when benchmarking code:
> 
> http://www.ovro.caltech.edu/~dwh/correlator/pdf/dsp_programming.pdf
> 
> The FFT routine used on this processor made use of both
> the instruction and data cache (on-chip SRAM) on the
> DSP.
> 
> This code is being re-developed to run on an MPC8349EA PowerPC
> with FPU. I did some initial testing to confirm that the
> FPU operates as per the data sheet, and will eventually get
> around to more complete testing.
> 
> Which processor were you running your code on, and what
> frequency were you operating the processor at? How does
> the algorithm timing compare when run on other processors,
> e.g., your desktop or laptop machine?
> 
> Cheers,
> Dave
> _______________________________________________
> Linuxppc-embedded mailing list
> Linuxppc-embedded at ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc-embedded
> 
> 
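
The benchmarking approach suggested in the quoted message can be sketched
in C roughly as follows. This is only a minimal sketch, assuming a POSIX
system that provides clock_gettime(). The function names (now_ns, vec_axpy,
bench_axpy_ns_per_elem, ns_to_cycles) and the y[i] += a * x[i] operation
are hypothetical stand-ins, not the algorithm discussed in this thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Monotonic time in nanoseconds. */
double now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Section under test: y[i] += a * x[i] (illustrative vector operation). */
void vec_axpy(float a, const float *x, float *y, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/*
 * Run `reps` timed passes over n-element vectors and return the average
 * time per element in nanoseconds, or -1.0 if allocation fails.
 */
double bench_axpy_ns_per_elem(size_t n, unsigned reps)
{
    assert(n > 0 && reps > 0);

    float *x = malloc(n * sizeof *x);
    float *y = malloc(n * sizeof *y);
    if (x == NULL || y == NULL) {
        free(x);
        free(y);
        return -1.0;
    }
    for (size_t i = 0; i < n; i++) {
        x[i] = 1.0f;
        y[i] = 0.0f;
    }

    /* Untimed warm-up pass: loads the data into cache if it fits. */
    vec_axpy(0.5f, x, y, n);

    double t0 = now_ns();
    for (unsigned r = 0; r < reps; r++)
        vec_axpy(0.5f, x, y, n);
    double t1 = now_ns();

    /* Read a result so the compiler cannot discard the timed work. */
    volatile float sink = y[n - 1];
    (void)sink;

    free(x);
    free(y);
    return (t1 - t0) / ((double)n * (double)reps);
}

/* Convert ns per element to clock cycles per element at clock_mhz. */
double ns_to_cycles(double ns, double clock_mhz)
{
    return ns * clock_mhz / 1000.0;
}
```

One way to use this is to call bench_axpy_ns_per_elem() for a range of
vector lengths, from a few KiB up to several MiB, and convert each result
with ns_to_cycles() using the actual core clock. If the cycles per element
stay flat for small vectors and then jump once the working set
(2 * n * sizeof(float) bytes) grows past the data cache size, that would
suggest the operation is limited by memory traffic rather than by the
floating-point arithmetic itself.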

