G5 Xserve rackmeter broken?

Aaro Koskinen aaro.koskinen at iki.fi
Thu May 14 21:48:12 AEST 2015


Hi,

On Thu, May 14, 2015 at 08:14:57PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2015-05-14 at 13:06 +0300, Aaro Koskinen wrote:
> > On Wed, May 13, 2015 at 06:39:40AM +1000, Benjamin Herrenschmidt wrote:
> > > On Tue, 2015-05-12 at 20:55 +0300, Aaro Koskinen wrote:
> > > > I'm running with HZ=100 so the values are still probably within
> > > > jiffy resolution, so perhaps the calculation should first do
> > > > idle = min(idle, total)?
> > > 
> > > Does it give you a reasonable output if you do that?
> > 
> > The below change fixes the idle system blinking behaviour.
> > 
> > I'm also able to reproduce the LEDs going off in the full CPU load case.
> > It seems there is some DMA error. Normally, reading rm->dma_regs->status
> > in the IRQ handler gives 0x8400. In the failure cases I've seen values
> > 0x8880 and 0x8980 - the IRQ will stop after this and it will need a
> > paused -> started cycle before it gets going again (but sometimes fails
> > again soon after).
> 
> That's a bit worrisome, is that new? Smells like faulting HW ...

Ok, right... I swapped the PSU and HD into a different box, and now it
seems to work as expected! (At least the first hour into GCC bootstrap
is still going fine...)
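
For reference, the idle = min(idle, total) clamp discussed earlier in
the thread can be sketched roughly as below. This is only an
illustration, not the actual rack-meter driver change: the function
name load_percent() and its parameters are hypothetical, and it assumes
both values are tick counts sampled over the same interval.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not taken from the rack-meter driver).
 * Returns CPU load in percent for one sampling interval.
 */
static unsigned int load_percent(uint64_t total_ticks, uint64_t idle_ticks)
{
	/* Nothing was sampled: report no load instead of dividing by zero. */
	if (total_ticks == 0)
		return 0;

	/*
	 * At jiffy resolution (e.g. HZ=100) the idle count can come out
	 * slightly larger than the total. Clamp it first so the unsigned
	 * subtraction below cannot wrap around.
	 */
	if (idle_ticks > total_ticks)
		idle_ticks = total_ticks;

	return (unsigned int)(100 * (total_ticks - idle_ticks) / total_ticks);
}
```

Without the clamp, an idle count just above the total would make the
unsigned subtraction wrap to a huge value, which would look like a
sudden spike in load on an otherwise idle system.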

A.

