[PROBLEM] Soft lockup on Linux 2.6.27, 2 patches, Cell/PPC64

Benjamin Herrenschmidt benh at kernel.crashing.org
Wed Oct 15 22:49:52 EST 2008


On Wed, 2008-10-15 at 13:46 +0200, Geert Uytterhoeven wrote:
> On Wed, 15 Oct 2008, Benjamin Herrenschmidt wrote:
> > > > Well, at the time of the sample, the other CPU indeed -seems- to be in
> > > > an IRQ disabled section yes. 
> > > 
> > > This is not really a sample. The hardirqs enable/disable is actually tracked
> > > using the TRACE_{EN,DIS}ABLE_INTS macros.
> > 
> > That's what I meant. I.e. the hardirq state was updated by the stuck CPU
> > but sampled by the non-stuck one, i.e. the non-stuck one could have
> > sampled a transient value where it happened to have hard IRQs
> > disabled...
> 
> These states are per_cpu.

I know, but that doesn't prevent another CPU from peeking at them :-)
The question is: was the message printed by the CPU that locked up, or by
the other one that detected the lockup?
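
To make the concern concrete, here is a minimal userspace sketch, not
kernel code. Every name in it (hardirqs_enabled[], record_irqs_off(),
record_irqs_on(), peek_irqs_enabled(), sample_inside_window()) is
hypothetical and only stands in for whatever TRACE_{EN,DIS}ABLE_INTS
actually records. It assumes two CPUs and shows that a per-CPU slot can
be read by an observer that does not own it, so that observer may catch
a short-lived "disabled" value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 2	/* assumption: one stuck CPU, one observer */

/* Hypothetical stand-in for the per-CPU recorded hardirq state. */
static atomic_bool hardirqs_enabled[NR_CPUS];

/* Roughly what recording a disable/enable would do for the owning CPU. */
static void record_irqs_off(int cpu)
{
	atomic_store(&hardirqs_enabled[cpu], false);
}

static void record_irqs_on(int cpu)
{
	atomic_store(&hardirqs_enabled[cpu], true);
}

/* "Per-CPU" only says who writes the slot; anyone can read it. */
static bool peek_irqs_enabled(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return atomic_load(&hardirqs_enabled[cpu]);
}

/*
 * CPU 0 enters a short IRQ-disabled window; the observer reads CPU 0's
 * slot inside that window. Returns the value the observer saw.
 */
static bool sample_inside_window(void)
{
	bool sampled;

	record_irqs_on(0);
	record_irqs_off(0);		/* CPU 0: window opens */
	sampled = peek_irqs_enabled(0);	/* observer peeks here */
	record_irqs_on(0);		/* CPU 0: window closes */

	return sampled;
}
```

Here sample_inside_window() reports "disabled" even though CPU 0's slot
reads "enabled" again right afterwards, which is why it matters which
CPU printed the message.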

> They do call TRACE_DISABLE_INTS, which records the interrupt being disabled.
> So this makes the actual state recording useless...

Well, they record that when they disable it. They don't enable it. Can
you find a spot where the IRQ is enabled and it's not recorded, or a case
where it's not disabled and recorded as disabled?

Cheers,
Ben.




