83xx GPIO/EXT int in arch/powerpc/
Kumar Gala
galak at kernel.crashing.org
Fri Jun 15 07:04:57 EST 2007
On Jun 12, 2007, at 11:06 AM, Marc Leeman wrote:
>> I'm confused what you are comparing here, 3 seconds on arch=ppc vs
>> over a minute on arch=powerpc?
>
> Loading of a TI DSP over HPI. HPI is implemented in UPMB (programmed by
> U-Boot); all that the kernel has to do is write to the location
> programmed in U-Boot (0xe2400000) to trigger the UPMB HPI protocol.
>
>> I'd expect the driver to be exactly the same (or close to it) for
>> arch=ppc vs arch=powerpc.
>
> Same here: there are a number of performance issues with respect to an
> older 8245-based implementation (the network seems slower too), and this
> was not one I expected while switching to powerpc.
Are you comparing 8245 on arch=ppc to 83xx on arch=powerpc or 83xx in
both cases?
>> There shouldn't be, but if you are seeing this we really should
>> figure out what's going on.
>>
>> What kernel is this on? What processor are you using?
>
> $ cat /proc/cpuinfo
> processor : 0
> cpu : e300c1
> clock : 396.000000MHz
> revision : 1.1 (pvr 8083 0011)
> bogomips : 131.28
> timebase : 66000000
> platform : BARCO834x SVC2
Hmm, we really need to put SVR in there as well (add that to the todo
list). Which 83xx is this?
> $ uname -a
> Linux barco 2.6.21.1-barco1 #1 PREEMPT Tue Jun 12 09:48:12 CEST 2007 ppc unknown
>
> The board is based on the Freescale SYS/EMDS reference design.
>
> Most of the HPI operations are stuff like this:
>
> static inline int8_t _hpi_set_hhwil(uint8_t b)
> {
>         volatile immap_t* im;
>
>         if(!(im = ioremap((immrbar),sizeof(struct immap)))){
>                 return -EINVAL;
>         }
>         (b)?(im->gpio[0].dat |= HHWIL):(im->gpio[0].dat &= ~HHWIL);
>         iounmap(im);
>
>         return 0;
> }
>
> static inline uint32_t __hpi_read_hpid(void)
> {
>         uint32_t returnval;
>
>         /* Program HPID */
>         _hpi_set_hcntl1(1);
>         _hpi_set_hcntl0(1);
>
>         /* first halfword */
>         _hpi_set_hhwil(0);
>         /* dummy read */
>         returnval = (((uint32_t)(*hpi_dsp))<<16);
>
>         /* delay */
>         udelay(1);
>
>         /* second halfword */
>         _hpi_set_hhwil(1);
>         /* dummy read */
>         returnval |= *hpi_dsp;
>
>         /* delay */
>         udelay(1);
>
>         return returnval;
> }
>
> The ioremap was certainly a bottleneck, and moving it to initialisation
> with a global pointer got us from 60 secs back to around 1 sec, but a
> similar effect was obtained on the ppc arch with this change (this was
> just hiding a bad situation).
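
For reference, the change described above (map once at initialisation,
reuse a global pointer on every access) could be sketched roughly as
follows. This is only a userspace model, not the actual driver:
mock_ioremap() and mock_iounmap() stand in for the kernel's ioremap() and
iounmap(), and the register layout, the HHWIL bit value and the hpi_*
helper names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HHWIL 0x00000100u              /* hypothetical bit position */

struct gpio_regs {                     /* hypothetical register layout */
        uint32_t dir;
        uint32_t odr;
        uint32_t dat;
};

static struct gpio_regs fake_gpio;     /* stands in for the GPIO block */
static unsigned int map_calls;         /* counts how often a mapping is set up */

/* Stand-in for ioremap(): hands out the fake block and counts the call. */
static volatile struct gpio_regs *mock_ioremap(void)
{
        map_calls++;
        return &fake_gpio;
}

/* Stand-in for iounmap(): nothing to release in this model. */
static void mock_iounmap(volatile struct gpio_regs *p)
{
        (void)p;
}

/* Mapped once in hpi_init(), then reused by every register access. */
static volatile struct gpio_regs *gpio;

int hpi_init(void)
{
        gpio = mock_ioremap();
        return gpio ? 0 : -1;
}

void hpi_exit(void)
{
        mock_iounmap(gpio);
        gpio = NULL;
}

/* Read-modify-write of the data register, with no per-call mapping. */
void hpi_set_hhwil(int on)
{
        assert(gpio != NULL);
        if (on)
                gpio->dat |= HHWIL;
        else
                gpio->dat &= ~HHWIL;
}

/* Small accessors so a caller can inspect the model's state. */
unsigned int hpi_map_calls(void) { return map_calls; }
uint32_t hpi_gpio_dat(void) { return fake_gpio.dat; }
```

With this shape the mapping cost is paid once in hpi_init() rather than
on each toggle of HHWIL, however many halfword reads follow.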
Do you have a sense of how many calls you make to _hpi_set_hhwil and
__hpi_read_hpid?
I think I know why the ppc case was more efficient.
> Even after this change, the load of a streaming application was
> about 40% on ppc and 60% on powerpc.
What's going on during the streaming?
- k