[PATCH] powerpc: POWER7 optimised memcpy using VMX and enhanced prefetch

Jimi Xenidis jimix at pobox.com
Wed Dec 19 00:21:16 EST 2012


On Dec 17, 2012, at 5:33 AM, Anton Blanchard <anton at samba.org> wrote:

> 
> Hi Jimi,
> 
>> I know this is a little late, but shouldn't these power7 specific
>> thingies be in "obj-$(CONFIG_PPC_BOOK3S_64)". The reason I ask is
>> that my compiler pukes on "dcbtst" and as I deal with that I wanted
>> to point this out.
> 
> I guess we could do that.

I think it is the right idea, since it is unclear that your optimizations would actually help an embedded system, where most of these cache prefetches are NOPs and only waste decode/dispatch cycles.
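
For illustration only, the Makefile change being suggested might look roughly like the fragment below. This is a sketch, not the actual patch: the object names are assumed from the patch subject, and the real lists in arch/powerpc/lib/Makefile may well differ.

```make
# Hypothetical sketch of arch/powerpc/lib/Makefile.
# Object names are assumptions for illustration, not copied from the tree.

# Generic 64-bit routines stay available to every PPC64 configuration.
obj-$(CONFIG_PPC64)          += memcpy_64.o

# POWER7-specific routines are built only for 64-bit Book3S (server) parts,
# so embedded 64-bit builds never have to assemble them.
obj-$(CONFIG_PPC_BOOK3S_64)  += memcpy_power7.o
```

Any caller that currently branches to the POWER7 routine unconditionally would presumably also need a guard, so that non-Book3S builds do not end up with an unresolved symbol.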

> It's a bit strange your assembler is
> complaining about the dcbtst instructions since we wrap them with
> power4:

Not really. The binutils is a little old (RHEL 6.2); unfortunately, it _is_ the toolchain most people are using at the moment.
It will take me a while to get everyone onto newer ones, since most are scientists using the packages they get.

My suggestion was really for correctness. My current patches for BG/Q introduce a macro replacement.
-jx


> 
> .machine push
> .machine "power4"
>        dcbt    r0,r4,0b01000
>        dcbt    r0,r7,0b01010
>        dcbtst  r0,r9,0b01000
>        dcbtst  r0,r10,0b01010
>        eieio
>        dcbt    r0,r8,0b01010   /* GO */
> .machine pop
> 
> Anton
