On Wed, Aug 05, 2015 at 03:29:35PM +0200, Christophe Leroy wrote:
> On the 8xx, load latency is 2 cycles and taking branches also takes
> 2 cycles. So let's unroll the loop.

This is not true for most other 32-bit PowerPC; this patch makes
performance worse on e.g. 6xx/7xx/7xxx.  Let's not!


Segher