[PATCH] powerpc: POWER7 optimised memcpy using VMX and enhanced prefetch
jimix at pobox.com
Wed Dec 19 00:21:16 EST 2012
On Dec 17, 2012, at 5:33 AM, Anton Blanchard <anton at samba.org> wrote:
> Hi Jimi,
>> I know this is a little late, but shouldn't these power7 specific
>> thingies be in "obj-$(CONFIG_PPC_BOOK3S_64)". The reason I ask is
>> that my compiler pukes on "dcbtst" and as I deal with that I wanted
>> to point this out.
> I guess we could do that.
I think it is the right idea, since it is unclear that your optimizations would actually help an embedded system, where most of these cache prefetches are NOPs and only waste decode/dispatch cycles.
> It's a bit strange your assembler is
> complaining about the dcbtst instructions since we wrap them with
Not really. The binutils here is a little old (RHEL 6.2); unfortunately, it _is_ the toolchain most people are using at the moment.
It will take me a while to get everyone onto a newer one, since most are scientists using the packages they get.
My suggestion was really for correctness. My current patches for BG/Q introduce a macro replacement.
> .machine push
> .machine "power4"
> dcbt r0,r4,0b01000
> dcbt r0,r7,0b01010
> dcbtst r0,r9,0b01000
> dcbtst r0,r10,0b01010
> dcbt r0,r8,0b01010 /* GO */
> .machine pop
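For anyone wondering what a macro replacement could look like in principle: one common way around an assembler that rejects a mnemonic form is to emit the instruction word directly with .long. The sketch below is not the BG/Q patch. It is a small illustrative helper (all names in it are hypothetical) that computes the word for the quoted dcbt/dcbtst forms, assuming the usual X-form layout with primary opcode 31, the TH hint in the RT field position, and extended opcodes 278 (dcbt) and 246 (dcbtst).

```python
# Illustrative sketch only; helper names are hypothetical, not from any patch.
# Assumes the X-form layout: opcode(6) | TH(5) | RA(5) | RB(5) | XO(10) | 0.

PRIMARY_OPCODE = 31   # primary opcode shared by dcbt and dcbtst
XO_DCBT = 278         # extended opcode for dcbt
XO_DCBTST = 246       # extended opcode for dcbtst


def encode_cache_touch(xo, ra, rb, th):
    """Return the 32-bit instruction word for 'dcbt/dcbtst RA,RB,TH'."""
    for name, value in (("ra", ra), ("rb", rb), ("th", th)):
        if not 0 <= value < 32:
            raise ValueError("%s must fit in 5 bits, got %r" % (name, value))
    return ((PRIMARY_OPCODE << 26) | (th << 21) | (ra << 16)
            | (rb << 11) | (xo << 1))


def as_long_directive(word):
    """Format an instruction word as a .long directive for the assembler."""
    return ".long 0x%08x" % word


if __name__ == "__main__":
    # The first and third instructions from the quoted sequence.
    print(as_long_directive(encode_cache_touch(XO_DCBT, 0, 4, 0b01000)))
    print(as_long_directive(encode_cache_touch(XO_DCBTST, 0, 9, 0b01000)))
```

A macro built this way does not depend on which operand order or TH width the installed assembler accepts, at the cost of hiding the mnemonic from anyone reading the disassembly-unfriendly source.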