Bad gcc-4.1.0 leads to Power4 crashes... and power5 too, actually

Sat Dec 23 17:28:31 EST 2006

On Wed, Dec 20, 2006 at 03:19:31PM -0600, Linas Vepstas wrote:
> On Tue, Dec 19, 2006 at 07:46:50PM -0600, Peter Bergner wrote:
> > On Tue, 2006-12-19 at 18:46 -0600, Linas Vepstas wrote:
> > > Per xchat, here's the update. I'm guessing I'm using a broken
> > > compiler, as per chain of evidence below ...
> > [snip]
> > > However, I also note that the following scrolled by:
> > > init/main.c:81:2: warning: #warning gcc-4.1.0 is known to miscompile the
> > > kernel. A different compiler version is recommended.
> > 
> > It may be due to this GCC bug which Olaf ran into a while back:
> > 
> >   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24644
> > 
> > You can verify whether you have a broken compiler by compiling
> > the minimal test case I posted in comment #15.  If you see r13
> > being copied into another register and then used, then you have
> > a broken compiler.
> 
> No, that's not it. I'd be surprised, as I was using the SuSE
> SLES10 gcc-4.1.0-28.4.ppc.rpm compiler, which would have that fix.

Hmm, this looks like a problem Paul forwarded on to me, originally
reported by Hugh Dickins <hugh at veritas.com>.  In his email, Hugh said:

> I spent too long looking in the wrong direction (head_64.S and entry_64.S),
> then noticed this in generic_file_aio_read from "objdump -rd mm/filemap.o":
>     3b54:	7d a5 6b 78 	mr      r5,r13
>     3b58:	38 c0 00 00 	li      r6,0
>     3b5c:	7c 09 03 a6 	mtctr   r0
>     3b60:	38 e0 00 00 	li      r7,0
>     3b64:	39 00 00 00 	li      r8,0
>     3b68:	eb a3 00 20 	ld      r29,32(r3)
>     3b6c:	48 00 00 48 	b       3bb4 <.generic_file_aio_read+0xa4>
>     3b70:	e9 49 00 08 	ld      r10,8(r9)
>     3b74:	7c e7 52 14 	add     r7,r7,r10
>     3b78:	7c e9 53 79 	or.     r9,r7,r10
>     3b7c:	41 c0 01 88 	blt-    3d04 <.generic_file_aio_read+0x1f4>
>     3b80:	e9 25 01 a0 	ld      r9,416(r5)
> 
> So, if the task is preempted and rescheduled on a different cpu in between
> the first and the last line, r5 will be looking at a different paca_struct
> from the one we're now on, and pick up the wrong __current.  (Well, there's
> a branch in the middle there, which then branches back: so the flow isn't
> quite as I've shown, but the effect is the same.)
> 
> That's compiled on SuSE 10.1, gcc 4.1.0-25 (with CONFIG_CC_OPTIMIZE_FOR_SIZE,
> but I've since checked that the same kind of thing happens without).  In most
> places it does use the expected 416(r13) for current, but occasionally via an
> intermediate register as here: why it should choose to do it that way I don't
> know, but assume it's some subtle and legitimate optimization.  It looks as
> if YDL 4.1's older gcc 3.4.4-2 does not do it that way.

I don't know if SuSE's 4.1.0-25 has the PR24644 fix, or whether that
fix cures the mm/filemap.c problem.  I do know that a 4.1.2 20061121
compiler I happened to have lying around made copies of r13 on 2.6.17
mm/filemap.c, even with local_paca made volatile.  The following
workaround allowed me to compile a kernel without any silly r13
copies.

#define get_paca()	({__asm__ __volatile__ ("#paca %0" : "=r" (local_paca)); local_paca;})

The asm tells gcc that local_paca is changed in some unspecified way
just before each access.  Explicitly making r13 volatile like this
should avoid the fuzzy gcc semantics of volatile global register
variables.
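
For anyone who wants to play with the idea outside the kernel, here is
a minimal userspace sketch of the same trick.  Everything in it is a
hypothetical stand-in: fake_paca, fake_local_paca, get_fake_paca(),
fake_migrate() and cpu_after_migration() are not kernel names, and
fake_local_paca is an ordinary global in memory rather than a global
register variable, so the constraint is "+m" instead of the "=r" used
above.  The asm template is left empty because the "#" comment syntax
is assembler specific.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-CPU paca_struct. */
struct fake_paca {
    int cpu_id;
};

static struct fake_paca fake_pacas[2] = { { 0 }, { 1 } };

/* Stand-in for r13: an ordinary global in memory, NOT a register variable. */
static struct fake_paca *fake_local_paca = &fake_pacas[0];

/*
 * The empty asm lists fake_local_paca as a read-write memory operand,
 * so gcc has to assume the value changed and reload it on every use
 * instead of reusing a copy made earlier.
 */
#define get_fake_paca() ({ \
    __asm__ __volatile__ ("" : "+m" (fake_local_paca)); \
    fake_local_paca; })

/* Simulate the scheduler moving the task to another cpu. */
static void fake_migrate(int cpu)
{
    assert(cpu >= 0 && cpu < 2);
    fake_local_paca = &fake_pacas[cpu];
}

/* Start on cpu 0, migrate to cpu 1, and report which cpu we then see. */
static int cpu_after_migration(void)
{
    fake_migrate(0);
    int before = get_fake_paca()->cpu_id;
    fake_migrate(1);
    int after = get_fake_paca()->cpu_id;
    assert(before == 0);
    return after;
}
```

A single-threaded program like this cannot reproduce the miscompile
itself; it only shows the shape of the accessor.  Comparing the
"objdump -d" output with and without the asm is the way to see whether
the compiler keeps a stale copy.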

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre