csum_partial() and csum_partial_copy_generic() are badly optimized?
tas at mindspring.com
Mon Nov 18 09:00:26 EST 2002
On Sunday, November 17, 2002, at 07:17 AM, Joakim Tjernlund wrote:
>> CTR and the instructions which operate on it
>> (such as bdnz) were put into the PPC architecture mainly as an
>> optimization opportunity for loops where the loop variable is not used
>> inside the loop body.
> loop variable not USED or loop variable not MODIFIED?
Not used. CTR cannot be specified as the source or destination of most
instructions. In order to access its contents you have to use special
instructions that move between it and a normal general purpose register.
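
To make the "not used" distinction concrete, here is a minimal C sketch. The function names sum_bytes and weighted_sum are made up for illustration and are not from the kernel or from this thread. In the first loop the count only decides when to stop, so a PowerPC compiler may (but does not have to) load it into CTR with mtctr and close the loop with bdnz. In the second loop the body reads the loop variable, so it has to be available in a general purpose register as well, which makes a CTR-only loop a less natural fit.

```c
#include <stddef.h>
#include <stdint.h>

/* The count n is only tested and decremented; the loop body never
 * reads it.  This is the shape of loop that CTR and bdnz are aimed at. */
uint32_t sum_bytes(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;

    while (n--) {
        sum += *p++;
    }
    return sum;
}

/* Here the loop variable i is read inside the body, both as an index
 * and as a multiplier.  Its value must therefore live in a general
 * purpose register, since CTR cannot be an operand of an ordinary
 * arithmetic or load instruction. */
uint32_t weighted_sum(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        sum += (uint32_t)p[i] * (uint32_t)i;
    }
    return sum;
}
```

Whether a given compiler actually emits bdnz for the first function depends on its version and flags, so the assembly output is the thing to check.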
>> Here's a summary of when gcc will compile
>> that crc32 loop with use of CTR and bdnz (note that -O3 or above
>> automatically turn on -funroll-loops, so I saw no point in testing
>> those levels):
>>            -O1    -O2    -O1 -funroll-loops    -O2 -funroll-loops
>> 2.95.4     no     no     no                    no
>> 3.1        no     yes    yes                   yes
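
For readers without the earlier message at hand, a bit-at-a-time CRC-32 loop of the general kind being discussed might look like the sketch below. This is not the loop that was actually compiled for the table above. The name crc32_bitwise and the choice of the common reflected polynomial 0xEDB88320 are assumptions made for illustration. The point is that the inner counter k is never read by the loop body, so it is a candidate for CTR and bdnz.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit-at-a-time CRC-32 (reflected polynomial 0xEDB88320,
 * initial value 0xFFFFFFFF, final inversion). */
uint32_t crc32_bitwise(const uint8_t *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    while (len--) {
        crc ^= *p++;
        /* k only counts the 8 iterations; the body never reads it. */
        for (int k = 0; k < 8; k++) {
            /* If the low bit is set, shift and XOR in the polynomial;
             * otherwise just shift.  The mask is all ones or all zeros. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}
```

Compiling something like this with -S at each optimization level and looking for bdnz in the output is one way to repeat the comparison with another compiler version.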
> hmm, looks like I should upgrade gcc to 3.1 or possibly 3.2. However
> I think that gcc >=3.0 has changed the ABI for C++, which is bad for
Sooner or later you're going to want to, though. :)
> Is 2.95.x still maintained? Maybe this optimization could be added
> to that branch.
I don't know. It probably is to some extent, just because there are
plenty of people in the same boat with you on the C++ ABI changes.