should cpus_in_xmon be volatile?
linas at austin.ibm.com
Wed Nov 24 05:22:51 EST 2004
On Tue, Nov 23, 2004 at 06:55:35AM +0100, Segher Boessenkool was heard to remark:
> >Well, to make >>that particular<< loop work correctly, the volatile is
> >needed. Why? Because cpus_weight() is extern __bitmap_weight() and
> >it's extern, the compiler must by definition invoke it each time in the
> >loop, since the compiler must assume that the called routine is
> >the value of the thing being pointed at. i.e. the call has a
> That's not correct. External linkage is an abstract concept, and by no
> means prevents the compiler from optimising across the boundaries of a
> translation unit (e.g., when performing whole-program optimisation).
> Of course, that's not the current (default) behaviour of GCC, but that
> doesn't make it a correct C program.
> >However, if someone changed the extern __bitmap_weight() to be
> >inline __bitmap_weight(), then the compiler could potentially see that
> >it had no side effects, and decide to optimize away the entire loop.
> It can potentially do that anyway. Nothing in the C standard prevents
> it from doing that.
OK, here in the US, the holidays are close, so what the heck, another
long academic reply follows.
Yes, of course. If there is a way for a compiler to determine that
a particular call has no side-effects, then it is quite valid for
the optimization to be performed. That's the point I was trying to
make. Since, as far as I know, there aren't any compilers that
actually *do* whole-program optimisation, the distinction is academic.
I guess that come the day that gcc does whole-program optimization,
then the only safe thing to do is to assume that all global vars are
volatile, and that therefore any subroutine acting on a global
does have a side-effect.
However, this potentially plays havoc with function signatures: if
globals are implicitly volatile, then what to do about routines that
take globals as arguments? If the argument to a subroutine is declared
volatile, then the compiler is prevented from doing certain types
of optimizations within that subroutine (e.g. if the argument is used in
a loop). So this kind of naive whole-program optimization would
lead to a massive performance degradation.
The alternative is to explicitly declare globals to be volatile,
and then cast away volatileness for those subroutines where we
know that it's not important.
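
As a rough illustration of that second alternative, here is a minimal
sketch. All of the names (cpus_in_xmon_mask, mask_weight,
mark_cpu_in_xmon, cpus_in_xmon_count) are hypothetical stand-ins, not
the real xmon or cpumask code, and a single unsigned long takes the
place of a cpumask. One caveat: in standard C, reading an object that
was defined volatile through a non-volatile lvalue is undefined
behaviour, so rather than literally casting the pointer, the sketch
takes one volatile read into a local copy and hands that copy to the
helper.

```c
/* Hypothetical stand-in for the global CPU mask; explicitly volatile. */
static volatile unsigned long cpus_in_xmon_mask;

/*
 * Helper with a plain (non-volatile) parameter, so the compiler is free
 * to optimise its loop. Counts the set bits in *m.
 */
static int mask_weight(const unsigned long *m)
{
    unsigned long v = *m;
    int n = 0;

    while (v) {
        n += (int)(v & 1UL);
        v >>= 1;
    }
    return n;
}

/* Record that a CPU has entered; cpu must be less than the bit width
 * of unsigned long in this simplified sketch. */
static void mark_cpu_in_xmon(int cpu)
{
    cpus_in_xmon_mask |= 1UL << cpu;
}

/*
 * Exactly one volatile read happens here. The snapshot is an ordinary
 * object, so passing its address to mask_weight() is well defined.
 */
static int cpus_in_xmon_count(void)
{
    unsigned long snapshot = cpus_in_xmon_mask;

    return mask_weight(&snapshot);
}
```

Each call to cpus_in_xmon_count() re-reads the global, because the
read is volatile, while the bit-counting loop itself runs on plain data.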
The third alternative was Paul's patch: pepper the code with
memory barriers wherever they may seem needed. This way, a
whole-program optimizer can assume that globals aren't volatile,
and instead we depend on manually-placed barriers for program
correctness. Which is in the spirit of how things are done today.
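
A minimal sketch of the barrier approach, under several assumptions:
the barrier() macro below is the GCC/Clang inline-asm idiom for a
compiler-only barrier (it does not emit a hardware memory barrier),
and cpus_in_xmon_mask, mask_weight, fake_cpu_enters and wait_for_cpus
are hypothetical names, not the code from Paul's patch. The
fake_cpu_enters() call exists only so that the sketch terminates when
run on a single thread; in the real situation the store would come from
another CPU and the loop body would contain just the barrier.

```c
/* Compiler-only barrier (GCC/Clang extension): tells the compiler that
 * memory may have changed, so cached values must be reloaded. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for the global CPU mask; deliberately NOT volatile. */
static unsigned long cpus_in_xmon_mask;

/* Side-effect-free bit count that the compiler can see into. */
static int mask_weight(unsigned long m)
{
    int n = 0;

    while (m) {
        n += (int)(m & 1UL);
        m >>= 1;
    }
    return n;
}

/* Demo only: simulates another CPU setting its bit. cpu must be less
 * than the bit width of unsigned long. */
static void fake_cpu_enters(int cpu)
{
    cpus_in_xmon_mask |= 1UL << cpu;
}

/*
 * Poll until at least `wanted` bits are set or max_spins is reached.
 * Returns the number of spins taken, or -1 on timeout.
 */
static int wait_for_cpus(int wanted, int max_spins)
{
    int spins;

    for (spins = 0; spins < max_spins; spins++) {
        if (mask_weight(cpus_in_xmon_mask) >= wanted)
            return spins;
        barrier();              /* force a fresh load on the next pass */
        fake_cpu_enters(spins); /* demo only, see the note above */
    }
    return -1;
}
```

Without the barrier() and without volatile, a compiler that can see
that mask_weight() has no side effects would be entitled to load the
global once and never look at it again.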