[Cbe-oss-dev] [PATCH 1/3] Fix Unlikely(x) == y
Adrian Bunk
bunk at kernel.org
Tue Feb 19 00:56:09 EST 2008
On Sun, Feb 17, 2008 at 10:50:03PM +1100, Michael Ellerman wrote:
> On Sat, 2008-02-16 at 10:39 -0800, Arjan van de Ven wrote:
>...
> > for mordern (last 10 years) x86 cpus... the cpu branchpredictor is better
> > than the coder in general. Same for most other architectures.
> >
> > unlikely() creates bigger code as well.
> >
> > Now... we're talking about your super duper hotpath function here right?
> > One where you care about 0.5 cycle speed improvement? (less on modern
> > systems ;)
>
> The first patch was to platforms/ps3 code, which runs on the Cell, in
> particular the PPE ... which is not an x86 :)
>
> eg:
>
> [michael at schoenaich ~]$ cat branch.c
> #include <stdio.h>
> int main(void)
> {
> int i, j;
>
> for (i = 0, j = 0; i < 1000000000; i++)
> if (i % 4 == 0)
> j++;
>
> printf("j = %d\n", j);
> return 0;
> }
> [michael at schoenaich ~]$ ppu-gcc -Wall -O3 -o branch branch.c
> [michael at schoenaich ~]$ time ./branch
> real 0m5.172s
>
> [michael at schoenaich ~]$ cat branch.c
> ..
> for (i = 0, j = 0; i < 1000000000; i++)
> if (__builtin_expect(i % 4 == 0, 0))
> j++;
> ..
> [michael at schoenaich ~]$ ppu-gcc -Wall -O3 -o branch branch.c
> [michael at schoenaich ~]$ time ./branch
> real 0m3.762s
>
>
> Which looks as though unlikely() is helping. Admittedly we don't have a
> lot of kernel code that looks like that, but at least unlikely is doing
> what we want it to.
This means it generates faster code with a current gcc for your platform.
But a future gcc might e.g. replace the whole loop with a division
(gcc SVN head (that will soon become gcc 4.3) already does
transformations like replacing loops with divisions [1]).
And your __builtin_expect() then might have unwanted effects on gcc.
Or the kernel code changes much but the likely/unlikely stays unchanged
although it becomes wrong.
If it is a real hotpath in the kernel where you have _measurable_
performance advantages from using likely/unlikely it's usage might be
justified, but otherwise it shouldn't be used.
> cheers
cu
Adrian
[1] e.g. the while() loop in timespec_add_ns() in include/linux/time.h
gets replaced by a division and a modulo (whether this
transformation is correct in this specific case is a different
question, but that's the level of code transformation gcc already
does today)
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
More information about the cbe-oss-dev
mailing list