Sandpoint & random crashes?
Mark A. Greer
mgreer at mvista.com
Tue Sep 19 04:42:10 EST 2000
Hey Alex.
I now have a Sandpoint that exhibits problems like the ones you were seeing.
That's good because now I have something to test with.
In addition, I'm planning on a fairly major overhaul sometime "soon". The
problem is, I'm neck deep in other work right now.
I'll let you know when I have that work done. If you fix any problems, please
let the community know.
Thanks,
Mark
--
> Hi, Mark!
>
> On Mon, Sep 11, 2000 at 04:11:53PM -0700, you wrote the following:
>
> > > Today I think I noticed a very interesting consistency that might be
> > > helpful. I haven't had the time to test it completely; I'll do it
> > > tomorrow and post again. The thing is, there's a little green LED on
> > > the board saying "backup power" or something like that. If you turn
> > > off the computer and the power supply, and leave it off for half a
> > > minute or so, the LED turns off. If you turn the computer on
> > > afterwards and load the kernel, it loads init and you can work (until
> > > it crashes). If you just reset the computer and load the kernel (after
> > > uploading it via dink of course), init won't load.
> > >
> >
> > This almost sounds like a hardware problem. How old is your
> > processor module? Remember this is a test platform for MOT SPS
> > where they test out new processors, etc. They may have given you an
> > early rev board or processor or host bridge or... If you have an
> > old one, you may want to ask for a newer one.
>
> We've just bought those boards now from Motorola. I checked on their
> site and we have the latest revisions...
>
> As to hardware problems, I took the other box we have here (an
> identical configuration) and tested there, with the same results. :-(
> So if it's a hardware problem, it's in all those boards. I also ran
> the memory test that dink has on all the memory that I can (from 90000
> to the end; before that resides dink itself) and it didn't find any
> errors. (I ran all the six or seven tests, 19 times in a row -- about
> 19-20 hours of testing.) So unless I'm extremely unlucky (and there
> are problems in the low range), the memory isn't a problem either.
>
> I downloaded the compilers from CDK 1.2 and compiled a kernel with
> them. Made no difference.
>
> Here are some more crash dumps, FWIW. Something is definitely fishy in
> regard to memory management.
>
> This one is weird -- I don't have any swap.
>
> > mount-t^H ^H^H^H
>
> sh: mount-: coBad swap file entry 00000085
> kernel BUG at swap_state.c:71!
>
> NIP: C002F0D4 XER: 20000000 LR: C002F0D4 REGS: c0283ce0 TRAP: 0700
> MSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
> TASK = c0282000[7] 'sh' Last syscall: 1
> last math c0282000 last altivec 00000000
> GPR00: C002F0D4 C0283D90 C0282000 0000001F 00001032 C0105734 C0140000 00000000
> GPR08: C0140000 C0100000 C0110000 C0283CD0 44262824 100A09F8 00000000 100A6190
> GPR16: 00000000 00000000 00000000 00000000 C01310E0 C0284100 1024D000 0FF2E000
> GPR24: 00000000 00000000 0032E000 00000041 C03CDB4C C0284100 00000085 C01AF6B8
> Call backtrace:
> C002F0D4 C002F1EC C002F328 C00208B8 C002393C C0013F84 C0016FB0
> C00171F4 C0004CC0 0FE79968 1002190C 10020DC8 1001D75C 1004D09C
> 10010B08 1000FBC4 0FE6F75C 00000000
> Kernel panic: Exception in kernel pc c002f0d4 signal 4
>
> backtrace:
> 0xc002f0d4 -- 0xc002f080 + 0x0054 __delete_from_swap_cache
> 0xc002f1ec -- 0xc002f14c + 0x00a0 delete_from_swap_cache_nolock
> 0xc002f328 -- 0xc002f280 + 0x00a8 free_page_and_swap_cache
> 0xc00208b8 -- 0xc0020730 + 0x0188 zap_page_range
> 0xc002393c -- 0xc0023838 + 0x0104 exit_mmap
> 0xc0013f84 -- 0xc0013f4c + 0x0038 mmput
> 0xc0016fb0 -- 0xc0016ed0 + 0x00e0 do_exit
> 0xc00171f4 -- 0xc00171f4 + 0x0000 sys_wait4
> 0xc0004cc0 -- 0xc0004cc0 + 0x0000 ret_from_syscall_1
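
The annotated backtrace above pairs each return address with the nearest
preceding symbol and an offset from it. A minimal Python sketch of that
lookup follows. It assumes a System.map-style listing is available; the
table below holds only the symbol start addresses quoted in the backtrace,
and the names SYMBOLS, parse_system_map, resolve and describe are
illustrative, not part of any kernel tool.

```python
import bisect

# Symbol start addresses copied from the annotated backtrace above.
# A real run would load the full System.map instead (assumption).
SYMBOLS = sorted([
    (0xC0013F4C, "mmput"),
    (0xC0016ED0, "do_exit"),
    (0xC0020730, "zap_page_range"),
    (0xC0023838, "exit_mmap"),
    (0xC002F080, "__delete_from_swap_cache"),
    (0xC002F14C, "delete_from_swap_cache_nolock"),
    (0xC002F280, "free_page_and_swap_cache"),
])


def parse_system_map(text):
    """Parse lines of the form 'c002f080 T name' into sorted (addr, name) pairs."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            pairs.append((int(fields[0], 16), fields[2]))
    return sorted(pairs)


def resolve(addr, symbols=SYMBOLS):
    """Return (name, base, offset) for the nearest symbol at or below addr, else None."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None  # addr lies below every known symbol
    base, name = symbols[i]
    return name, base, addr - base


def describe(addr, symbols=SYMBOLS):
    """Format one line in the same layout as the annotated backtrace."""
    hit = resolve(addr, symbols)
    if hit is None:
        return f"0x{addr:08x} -- unknown address"
    name, base, offset = hit
    return f"0x{addr:08x} -- 0x{base:08x} + 0x{offset:04x} {name}"


if __name__ == "__main__":
    for address in (0xC002F0D4, 0xC002F1EC, 0xC00208B8, 0x00003C60):
        print(describe(address))
```

With only a partial table, any address above the last listed symbol is
attributed to that symbol, so the result is only as trustworthy as the
symbol list is complete.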
>
> And this one is crazy -- 14,500,000 worked fine, 15,000,000 gave me
> "Out of memory", and the middle between them gave me this:
>
> bash-2.03# perl -e '$a="A"x14750000'
> kmem_free: Bad obj addr (objp=c0177500, name=size-64)
> kernel BUG at slab.c:1695!
> NIP: C002CDD4 XER: 20000000 LR: C002CDD4 REGS: c0104cd0 TRAP: 0700
> MSR: 00089032 EE: 1 PR: 0 FP: 0 ME: 1 IR/DR: 11
> TASK = c0103000[0] 'swapper' Last syscall: 36
> last math 00000000 last altivec 00000000
> GPR00: C002CDD4 C0104D80 C0103000 0000001B 00001032 C0105734 C0140000 00000000
> GPR08: C0140000 C0100000 C0110000 C0104CC0 24462024 100A09F8 00000000 00000000
> GPR16: 00000000 00000000 00000000 00000000 00000000 00104EB0 C02F7042 C02FFA40
> GPR24: C0177500 00000014 0000001C C01023E0 C017755C C0177FE0 C0177500 C01A0160
> Call backtrace:
> C002CDD4 C009E1E4 C009EA68 C009DA98 C009DF70 C0093E4C C00189EC
> C0004F60 00000000 C0006130 C0006144 C011678C 00003C60
> Kernel panic: Exception in kernel pc c002cdd4 signal 4
> In interrupt handler - not syncing
> Rebooting in 180 seconds..
>
> backtrace:
> 0xc002cdd4 -- 0xc002ca04 + 0x03d0 kfree
> 0xc009e1e4 -- 0xc009e0e4 + 0x0100 ip_free
> 0xc009ea68 -- 0xc009e740 + 0x0328 ip_defrag
> 0xc009da98 -- 0xc009da70 + 0x0028 ip_local_deliver
> 0xc009df70 -- 0xc009dc44 + 0x032c ip_rcv
> 0xc0093e4c -- 0xc0093c48 + 0x0204 net_rx_action
> 0xc00189ec -- 0xc0018934 + 0x00b8 do_softirq
> 0xc0004f60 -- 0xc0004f60 + 0x0000 do_bottom_half_ret
> 0x00000000 -- unknown address
> 0xc0006130 -- 0xc00060c0 + 0x0070 idled
> 0xc0006144 -- 0xc0006134 + 0x0010 cpu_idle
> 0xc011678c -- 0xc0116644 + 0x0148 start_kernel
> 0x00003c60 -- unknown address
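
Both dumps above print MSR: 00089032 followed by the flags EE, PR, FP, ME
and IR/DR. As a cross-check, here is a small Python sketch that extracts
those flags from the raw value using the classic 32-bit PowerPC MSR bit
masks. The names MSR_BITS and decode_msr are illustrative; bits of the
value other than these six are not interpreted here.

```python
# Classic 32-bit PowerPC MSR bit masks for the flags shown in the dumps.
MSR_BITS = {
    "EE": 0x8000,  # external interrupts enabled
    "PR": 0x4000,  # problem state (1 = user mode)
    "FP": 0x2000,  # floating point available
    "ME": 0x1000,  # machine check enabled
    "IR": 0x0020,  # instruction address translation
    "DR": 0x0010,  # data address translation
}


def decode_msr(value):
    """Return a dict mapping each flag name to 0 or 1 for the given MSR value."""
    return {name: int(bool(value & mask)) for name, mask in MSR_BITS.items()}


if __name__ == "__main__":
    flags = decode_msr(0x00089032)
    print(" ".join(f"{name}: {bit}" for name, bit in flags.items()))
```

For 0x00089032 this yields EE 1, PR 0, FP 0, ME 1, IR 1, DR 1, which
agrees with the flags printed in the dumps.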
>
> --
> Alex Shnitman | http://www.debian.org
> alexsh at hectic.net, alexsh at linux.org.il +-----------------------
> http://alexsh.hectic.net UIN 188956 PGP key on web page
> E1 F2 7B 6C A0 31 80 28 63 B8 02 BA 65 C7 8B BA
>
> /real/ kernel hackers
> dd if=/dev/urandom of=/vmlinuz
> and influence the Universal Randomosity Field.
> -- Gaal Yahas
--
Mark A. Greer (mgreer at mvista.com; 480-517-0287)
MontaVista Software, Inc.
2141 E. Broadway Road, Suite 108
Tempe, AZ 85282
** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/