Random crashes
Giuliano Pochini
pochini at shiny.it
Thu Aug 28 23:25:09 EST 2003
On 28-Aug-2003 Benjamin Herrenschmidt wrote:
>> > Strange. I haven't had such problems reported. Can you try an older kernel
>> > just in case? Could also be bad RAM...
>>
>> I tried 2.4.22 and I replaced the RAM. Nothing. Digging in the oops
>> collection I found this one which doesn't look very nice:
>>
>> Jul 23 21:37:55 localhost kernel: Machine check in kernel mode.
>> Jul 23 21:37:55 localhost kernel: Caused by (from SRR1=20009030): L1 Data Cache error
>>
>> I'll send the machine back for repair, although I think they won't even
>> notice the problem because it happens sporadically :(((
>
> Well... I'm not 100% sure the message is correct, though from what you say,
> it seems indeed there is a CPU fault...

Yes, but the machine check happened only once. All the other crashes were
"normal" segfaults, in both userspace and kernel space, plus hard lockups.

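For reference, here is a minimal sketch of how the cause string in the oops above can be derived from the SRR1 value. The mask and the cause table are assumptions recalled from the 2.4-era PowerPC machine check handler and should be checked against the kernel source; decode_srr1 and the constant names are hypothetical. MSR_PR (0x4000) is the standard 32-bit PowerPC problem-state bit.

```python
# Sketch only: CAUSE_MASK and CAUSES are assumptions recalled from the
# 2.4-era PowerPC machine check handler, not copied from the kernel tree.
CAUSE_MASK = 0x601F0000
MSR_PR = 0x4000  # problem state bit: set when the CPU was in user mode

CAUSES = {
    0x00080000: "Machine check signal",
    0x00040000: "Transfer error ack signal",
    0x00020000: "Data parity error signal",
    0x00010000: "Address parity error signal",
    0x20000000: "L1 Data Cache error",
    0x40000000: "L1 Instruction Cache error",
    0x00100000: "L2 data cache parity error",
}


def decode_srr1(srr1):
    """Return (mode, cause) for a machine check SRR1 value."""
    mode = "user" if srr1 & MSR_PR else "kernel"
    cause = CAUSES.get(srr1 & CAUSE_MASK, "Unknown values in msr")
    return mode, cause


mode, cause = decode_srr1(0x20009030)
print("Machine check in %s mode. Caused by: %s" % (mode, cause))
```

With SRR1=20009030 the masked value is 0x20000000 and the PR bit is clear, which agrees with the "kernel mode" and "L1 Data Cache error" lines in the log, assuming the table above is right.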
> What CPU is this exactly ? (/proc/cpuinfo)
processor : 0
cpu : 7455, altivec supported
clock : 1249MHz
revision : 3.3 (pvr 8001 0303)
bogomips : 1248.46
processor : 1
cpu : 7455, altivec supported
clock : 1249MHz
revision : 3.3 (pvr 8001 0303)
bogomips : 1248.46
total bogomips : 2496.92
machine : PowerMac3,6
motherboard : PowerMac3,6 MacRISC3 Power Macintosh
detected as : 129 (PowerMac G4 Windtunnel)
pmac flags : 00000000
L2 cache : 256K unified
memory : 512MB
pmac-generation : NewWorld
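As a cross-check of the revision line above, this is a small sketch of how "pvr 8001 0303" maps to "revision 3.3". It assumes the usual split of the processor version register into a 16-bit version and a 16-bit revision, with the revision read as major.minor bytes; decode_pvr is a hypothetical helper name.

```python
def decode_pvr(pvr):
    """Split a 32-bit PVR into (version, major, minor).

    Assumes the upper 16 bits are the processor version and the lower
    16 bits hold the revision as two bytes, major then minor.
    """
    version = pvr >> 16
    major = (pvr >> 8) & 0xFF
    minor = pvr & 0xFF
    return version, major, minor


version, major, minor = decode_pvr(0x80010303)
print("version %04x, revision %d.%d" % (version, major, minor))
# prints "version 8001, revision 3.3"
```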
I'm reading the latest 7455 errata:
http://e-www.motorola.com/files/32bit/doc/errata/MPC7455CE.pdf
but I don't see anything in it that could cause L1 cache errors.
Unrelated: the tlbli instruction can cause problems on the 7455 (bug.20),
and arch/ppc/kernel/head.S does not use the suggested workaround.
Bye.
Giuliano.
** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/