glibc-2.5 test suite hangs/crashes the machine

Steve Munroe sjmunroe at us.ibm.com
Thu Nov 2 09:17:28 EST 2006



Benjamin Herrenschmidt <benh at kernel.crashing.org> wrote on 10/29/2006
07:47:05 PM:

> On Fri, 2006-10-27 at 12:22 -0400, Jeff Bailey wrote:
> > On Friday, 27 October 2006 at 07:56 +0200, Fabio Massimo Di Nitto
> > wrote:
> > > Hi everybody,
> > >
> > > I am in the process of bootstrapping the new toolchain for Ubuntu and I am
> > > hitting a problem building glibc-2.5 on ppc.
> > >
> > > This behaviour has been reproduced on 2.6.15/2.6.17 and 2.6.19-rc2 (where the
> > > machine crashes) and with ppc32 and ppc64 kernels.
> > > A hard reboot of the machine is required to get rid of the Zl processes hanging
> > > around that keep spinning the CPU at 100%.
> > >
> > > I placed the sources here: http://people.ubuntu.com/~fabbione/benh/
> > >
> > > but I am starting to believe it is a kernel bug that we are only now triggering.
> > >
> > > Any hint or help for what to look for would be extremely appreciated.
> >
> > Heya Fabio, just an update, it looks like the tests that are zombie'ing
> > are the nptl tst-robust[1-8] tests.  According to /proc/##/wchan, the
> > tasks are cheerfully spinning in do_exit.
>
> So I've built that glibc with Debian 2.6.16 kernel headers (since Fabio
> says the problem doesn't happen with glibc built with 2.6.19 headers)
> and have run that with 2.6.19-rc3-git-du-jour.
>
> The machine didn't crash, nor did I see any zombies with those
> tst-robust[1-8], however, I did get a SIGBUS with tst-robustpi1. I've
> tracked it down to being an alignment exception. It looks like glibc is
> doing an lwarx on a non-aligned value, though I can't say precisely
> what's up here. I don't know how I can get a backtrace when running
> those test-cases... the test harness seems to catch signals, I suppose
> it could be modified to spit one out.
>
> At this point, it would be useful to have somebody who knows glibc to
> tell us:
>
>  - what are those tst-robust all about ? (what do they do "special" that
> might trigger bad reactions with older kernels)
>  - how can glibc ever do atomic operations on a non-aligned value ?
>
> Ben.
>
The tst-robustpi# tests exercise the new PTHREAD_MUTEX_ROBUST API
with the PTHREAD_PRIO_INHERIT attribute.
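
For anyone not familiar with that API, here is a minimal sketch of the
kind of mutex those tests set up. It is not taken from the glibc test
sources. It uses the POSIX.1-2008 spellings (pthread_mutexattr_setrobust,
pthread_mutex_consistent); if I recall correctly, glibc of this vintage
exposes the same calls with an _np suffix. The helper names
robust_pi_attr_init and robust_lock are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* expose the robust/protocol mutex attribute calls */
#endif
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialise *attr as "robust + priority inheritance".
   Returns 0 on success or an errno-style code on failure. */
int robust_pi_attr_init(pthread_mutexattr_t *attr)
{
    int rc = pthread_mutexattr_init(attr);
    if (rc != 0)
        return rc;

    /* Robust: a later locker is told if the previous owner died. */
    rc = pthread_mutexattr_setrobust(attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        /* PI: the owner is boosted to the priority of its highest waiter. */
        rc = pthread_mutexattr_setprotocol(attr, PTHREAD_PRIO_INHERIT);

    if (rc != 0)
        pthread_mutexattr_destroy(attr);
    return rc;
}

/* Hypothetical helper: lock m, recovering if the previous owner died.
   On a 0 return the caller holds the mutex. */
int robust_lock(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* We hold the lock, but the protected state may be stale.
           Mark the mutex usable again before carrying on. */
        rc = pthread_mutex_consistent(m);
    }
    return rc;
}
```

A caller would pass the attribute from robust_pi_attr_init() to
pthread_mutex_init() and then take the lock through robust_lock().
Initialising a PI mutex can fail on a kernel without PI futex support,
so that return code is worth checking.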

The futex word seems to include the waiter's TID; I don't know whether the
kernel cares about this or not.
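
A rough sketch of how that word can be picked apart, and of the
alignment check relevant to the lwarx question. The assumption is the
layout given in the kernel's <linux/futex.h>: a TID in the low 30 bits
(the lock owner's, according to the kernel documentation) and two flag
bits on top. The SKETCH_* names and the helper functions are local to
this example and only mirror the kernel's FUTEX_WAITERS,
FUTEX_OWNER_DIED and FUTEX_TID_MASK values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local names for illustration; values mirror <linux/futex.h>. */
#define SKETCH_FUTEX_WAITERS    0x80000000u  /* someone is blocked on the lock */
#define SKETCH_FUTEX_OWNER_DIED 0x40000000u  /* owner exited holding the lock  */
#define SKETCH_FUTEX_TID_MASK   0x3fffffffu  /* low 30 bits: a thread ID       */

/* Extract the TID stored in a futex word. */
uint32_t futex_word_tid(uint32_t word)
{
    return word & SKETCH_FUTEX_TID_MASK;
}

/* True if the "waiters" flag bit is set. */
bool futex_word_has_waiters(uint32_t word)
{
    return (word & SKETCH_FUTEX_WAITERS) != 0;
}

/* True if the "owner died" flag bit is set. */
bool futex_word_owner_died(uint32_t word)
{
    return (word & SKETCH_FUTEX_OWNER_DIED) != 0;
}

/* lwarx/stwcx. expect a naturally aligned 32-bit word; an address that
   fails this check is the kind that can raise an alignment exception. */
bool is_word_aligned(const void *p)
{
    return ((uintptr_t)p % sizeof(uint32_t)) == 0;
}
```

Printing is_word_aligned() for the address of the mutex's lock word just
before the failing lock call would be one cheap way to confirm or rule
out the misalignment theory.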


Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center



