[PROBLEM] Soft lockup on Linux 2.6.27, 2 patches, Cell/PPC64
Geert Uytterhoeven
Geert.Uytterhoeven at sonycom.com
Wed Oct 15 20:46:54 EST 2008
On Wed, 15 Oct 2008, Benjamin Herrenschmidt wrote:
> On Wed, 2008-10-15 at 11:25 +0200, Geert Uytterhoeven wrote:
> > On Wed, 15 Oct 2008, Benjamin Herrenschmidt wrote:
> > > On Tue, 2008-10-14 at 11:32 +0200, Geert Uytterhoeven wrote:
> > > > which points again to smp_call_function_single...
> > >
> > > Yup, it doesn't bring more information. At this stage, your 'other' CPU
> > > is stuck with interrupts disabled. Hard to tell what's happening without
> > > some HW assist. Do you have ways to trigger a non-maskable interrupt
> > > such as a 0x100 ? That would allow to catch the other guy in xmon and
> > > see what it was doing...
> >
> > Interrupts are not disabled on the other CPU thread, at least not according to
> > the irqs_disabled() check I added to the printing of the `spinlock lockup'
> > message in __spin_lock_debug().
> >
> > As the log also said
> >
> > | hardirqs last enabled at (5018779): [<c000000000007c1c>] restore+0x1c/0xe4
> > | hardirqs last disabled at (5018780): [<c000000000003600>] decrementer_common+0x100/0x180
> >
> > I started blinking the LEDs on decrementer interrupts, which do arrive on both
> > CPU threads.
>
> Hrm, ok I thought the log shows the decrementer interrupt of the thread
> that's still working. If you are confident they are both taking
> interrupts, then there's indeed something to track down.
>
> > However, I'm a bit puzzled by these `hardirqs last enabled/disabled' messages,
> > as they do indicate interrupts are off...
>
> Well, at the time of the sample, the other CPU indeed -seems- to be in
> an IRQ disabled section yes.
This is not really a sample. The hardirqs enable/disable state is actually
tracked using the TRACE_{EN,DIS}ABLE_INTS macros.

For the decrementer, the interrupt code is generated by the
STD_EXCEPTION_COMMON_LITE() macro.

Aha, none of the PPC interrupt handlers actually use TRACE_ENABLE_INTS (they do
use TRACE_DISABLE_INTS). So that's why it thinks decrementer_common disabled
interrupts without enabling them again...
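
To illustrate the effect, here is a minimal user-space sketch of that kind of
bookkeeping. It is a toy model, not the kernel's code: struct irq_trace and all
the function names below are hypothetical, and the only assumption is that each
traced transition gets a sequence number, as the numbers in the log suggest.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task IRQ trace state (not kernel code). */
struct irq_trace {
	unsigned long event;              /* running count of traced transitions */
	unsigned long last_enable_event;  /* sequence number of last traced enable */
	unsigned long last_disable_event; /* sequence number of last traced disable */
	const char *last_enable_site;
	const char *last_disable_site;
};

/* Record a traced "interrupts enabled" transition. */
static void trace_irqs_on(struct irq_trace *t, const char *site)
{
	t->last_enable_event = ++t->event;
	t->last_enable_site = site;
}

/* Record a traced "interrupts disabled" transition. */
static void trace_irqs_off(struct irq_trace *t, const char *site)
{
	t->last_disable_event = ++t->event;
	t->last_disable_site = site;
}

/* What a reader infers from the "last enabled/disabled" lines alone. */
static bool trace_says_irqs_off(const struct irq_trace *t)
{
	return t->last_disable_event > t->last_enable_event;
}

/* One interrupt: the entry is always traced, the exit only if trace_exit. */
static void take_interrupt(struct irq_trace *t, bool trace_exit)
{
	trace_irqs_off(t, "decrementer_common");
	/* ... handler body would run here ... */
	if (trace_exit)
		trace_irqs_on(t, "exception return");
}
```

In this model, a traced enable followed by take_interrupt(&t, false) leaves the
disable record exactly one event newer than the enable record, so
trace_says_irqs_off() reports true even though the handler has long returned.
With take_interrupt(&t, true) the records agree with the real state again.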
With kind regards,
Geert Uytterhoeven
Software Architect
Sony Techsoft Centre Europe
The Corporate Village · Da Vincilaan 7-D1 · B-1935 Zaventem · Belgium
Phone: +32 (0)2 700 8453
Fax: +32 (0)2 700 8622
E-mail: Geert.Uytterhoeven at sonycom.com
Internet: http://www.sony-europe.com/
A division of Sony Europe (Belgium) N.V.
VAT BE 0413.825.160 · RPR Brussels
Fortis · BIC GEBABEBB · IBAN BE41293037680010