[Cbe-oss-dev] SPU-GCC optimization problem
Trevor_Smigiel at PlayStation.Sony.com
Wed Mar 12 08:59:50 EST 2008
Ulrich,
Yes, removing that change, along with another one in combine.c, is the
correct fix. I've made both changes in the 3C repository.
Trevor
* Ulrich Weigand <Ulrich.Weigand at de.ibm.com> [2008-03-11 09:05]:
> Hello Trevor,
>
> please see Machida-san's bug report below. It looks like this is a
> problem that came in with your big PS3/FSF backport merge to the 3C
> repository (rev. 770). Before that, spu_rlmaskqwbyte would use an
> UNSPEC_SPU_ROTQBY unspec which never got optimized. After that change,
> the intrinsic is open-coded in terms of an LSHIFTRT rtx.
>
> This would be perfectly fine normally, as long as SHIFT_COUNT_TRUNCATED is
> not defined (which it isn't on SPU). However, there is special CELL
> LOCAL code in simplify-rtx.c:simplify_binary_operation_1 which uses an
> #ifdef SPU hack to *always* truncate shift counts anyway.
>
> There is some comment around this CELL LOCAL section that seems to say
> this is required to fix a bug in some test case. I'm wondering whether
> this statement is still correct after the rev. 770 changes to the back-end
> ... Reverting this CELL LOCAL change fixes the problem reported below.
> Can you advise whether this is the right thing to do? Thanks!
>
>
> Machida-san, thanks for reporting the problem!
>
>
> Mit freundlichen Gruessen / Best Regards
>
> Ulrich Weigand
>
> --
> Dr. Ulrich Weigand | Phone: +49-7031/16-3727
> GNU compiler/toolchain for Linux on System z and Cell BE
> IBM Deutschland Entwicklung GmbH
> Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung:
> Herbert Kircher
> Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
> Stuttgart, HRB 243294
>
>
> From: "Hiroyuki Machida" <Hiroyuki.Mach at gmail.com>
> Sent by: cbe-oss-dev-bounces+ulrich.weigand=de.ibm.com at ozlabs.org
> Date: 03/11/08 10:10 AM
> Reply-To: Hiroyuki.Mach at gmail.com
> To: cbe-oss-dev at ozlabs.org
> Subject: [Cbe-oss-dev] SPU-GCC optimization problem
>
> Hi,
>
> I have run into a problem with SPU-GCC in IBM Cell SDK 3.0.
> I could not reproduce it with SPU-GCC in the older Cell
> SDK 2.1 or 2.0. Details are attached below.
>
>
> Hiroyuki.
>
> ---
>
> * Summary
>
> spu-gcc generates incorrect code for the spu_rlmaskqwbyte intrinsic
> when all of the following hold:
>
> -- the code is compiled with "-O1" or a higher optimization level, and
>
> -- the second argument is a constant (immediate) value, and
>
> -- 1 <= (the second argument) mod 32 <= 16
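The third condition above can be checked mechanically. The following is a minimal sketch in C, not code from the report: floor_mod and count_is_affected are hypothetical names introduced here for illustration. It assumes "mod" in the summary means the mathematical modulo (result always in 0..31), which matters because C's % operator can return a negative remainder for a negative count such as -17. It covers only the range test on the constant, not the optimization level.

```c
#include <stdbool.h>

/* Mathematical modulo: the result is always in 0..m-1, unlike C's %
   operator, which can yield a negative value for a negative operand. */
static int floor_mod(int value, int m)
{
    int r = value % m;
    return r < 0 ? r + m : r;
}

/* Hypothetical helper (not part of any SDK): true if a constant byte
   count passed to spu_rlmaskqwbyte lies in the range that the report
   describes as miscompiled, i.e. 1 <= count mod 32 <= 16. */
static bool count_is_affected(int count)
{
    int r = floor_mod(count, 32);
    return r >= 1 && r <= 16;
}
```

For the sample code in this report, count_is_affected(-17) is true because -17 mod 32 is 15, while count_is_affected(-1) is false because -1 mod 32 is 31.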
>
> * Version
>
> IBM Cell SDK 3.0 spu-gcc, spu-g++
>
> IBM Cell SDK 2.1 and earlier do not have this problem.
>
> * Sample code
>
> ---
> #include <stdio.h>
> #include <spu_intrinsics.h>
>
> vector unsigned int source = { 0x11111111, 0x22222222, 0x33333333,
>                                0x44444444, };
>
> int main(int argc, char **argv)
> {
>     vector unsigned int result;
>
>     /* A count of -17 asks for a right shift by 17 bytes, which is
>        more than the 16 bytes in a quadword. */
>     result = spu_rlmaskqwbyte(source, -17);
>
>     /* all elements should be zero. */
>     printf("0x%08x 0x%08x 0x%08x 0x%08x, \n",
>            spu_extract(result, 0),
>            spu_extract(result, 1),
>            spu_extract(result, 2),
>            spu_extract(result, 3));
>
>     return 0;
> }
> ---
>
> * Additional information
>
> It seems that when optimization is enabled, the second argument is
> normalized into the range -15 to 0, so that -17 becomes -1 (see the
> marked instruction in the listing below).
>
> ---
> main:
> ila $3,.LC0
> hbrr .L3,printf
> lqr $2,source
> stqd $lr,16($sp)
> stqd $sp,-32($sp)
> ai $sp,$sp,-32
> nop 127
> rotqmbyi $7,$2,-1 # <==========
> ori $4,$7,0
> rotqbyi $5,$7,(1*4+0)%16
> rotqbyi $6,$7,(2*4+0)%16
> rotqbyi $7,$7,(3*4+0)%16
> nop 127
> .L3:
> brsl $lr,printf
> ai $sp,$sp,32
> fsmbi $3,0
> lqd $lr,16($sp)
> bi $lr
> ---
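To illustrate why the -1 in the marked rotqmbyi instruction gives a different result from the -17 in the source, here is a rough C model of the byte shift. It is a sketch under stated assumptions, not the compiler's or the hardware's implementation. It assumes that rotqmbyi shifts the quadword right by (0 - count) mod 32 bytes and yields zero for shift amounts of 16 to 31, and that the faulty simplification keeps only the low four bits of the shift amount. The second assumption is inferred from the "-15 to 0" observation above, not from the GCC source. The names model_rlmaskqwbyte and truncate_count are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Model of a quadword right shift by bytes, under the assumption that
   the shift amount is (0 - count) mod 32 and that amounts of 16..31
   produce an all-zero result. */
static void model_rlmaskqwbyte(const uint8_t src[16], int count,
                               uint8_t dst[16])
{
    unsigned shift = (0u - (unsigned)count) & 0x1f;

    memset(dst, 0, 16);
    if (shift >= 16)
        return;                 /* everything is shifted out */

    /* Byte 0 is the most significant byte, so a right shift moves
       bytes toward higher indices and zero-fills from the front. */
    memcpy(dst + shift, src, 16 - shift);
}

/* Assumed effect of the unwanted truncation: only the low four bits
   of the shift amount survive, so the count lands in -15..0. */
static int truncate_count(int count)
{
    unsigned shift = (0u - (unsigned)count) & 0x0f;
    return -(int)shift;
}
```

Under these assumptions, a count of -17 clears all 16 bytes, whereas truncate_count(-17) returns -1 and the model then shifts by a single byte. For the source vector in the sample that gives 0x00111111 0x11222222 0x22333333 0x33444444 instead of four zero words.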
> _______________________________________________
> cbe-oss-dev mailing list
> cbe-oss-dev at ozlabs.org
> https://ozlabs.org/mailman/listinfo/cbe-oss-dev