z constraint in powerpc inline assembly ?

David Laight David.Laight at ACULAB.COM
Fri Jan 17 04:20:14 AEDT 2020


> You mean the mpc8xx, but I'm also using the mpc832x, which has an e300c2
> core and is capable of executing 2 insns in parallel if they are not in
> the same unit.

That should let you do a memory read and an add.
(I can't remember if the ppc has 'add from memory' but that is
likely to use both units anyway.)
An infinitely unrolled loop will then be 4 bytes/clock (for 32bit),
i.e. one 32-bit load and one add per clock.
If you get to 3 bytes/clock for a real loop you are doing ok.
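
For illustration only, a minimal portable C sketch of the kind of loop
being discussed: one 32-bit load paired with one add per step, unrolled
by four. It assumes the loop is a checksum-style sum of 32-bit words
with the carries folded back in at the end. The function name
sum_words, the unroll factor and the 64-bit accumulator (standing in
for the carry flag that inline assembly would use) are all assumptions
of this sketch, not code from the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add up 32-bit words with end-around carry. */
static uint32_t sum_words(const uint32_t *p, size_t nwords)
{
    uint64_t acc = 0;   /* upper 32 bits collect the carries */
    size_t i = 0;

    /* Unrolled by 4: each statement is one load plus one add. */
    for (; i + 4 <= nwords; i += 4) {
        acc += p[i];
        acc += p[i + 1];
        acc += p[i + 2];
        acc += p[i + 3];
    }

    /* Remaining 0..3 words. */
    for (; i < nwords; i++)
        acc += p[i];

    /* Fold the carries back into the low 32 bits (twice, in case
     * the first fold itself carries). */
    acc = (acc & 0xffffffffu) + (acc >> 32);
    acc = (acc & 0xffffffffu) + (acc >> 32);
    return (uint32_t)acc;
}
```

Whether the compiler actually pairs each load with an add, and how far
unrolling pays off, depends on the core and on the i-cache pressure, so
this would need measuring on the real target.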

Remember, unroll too much and you displace other code from
the i-cache. Also the i-cache loads themselves kill you.
(A hot-cache benchmark won't see this...)

	David



More information about the Linuxppc-dev mailing list