Memory Corruption in Linux kernel MPC8347 revision 3

Bhupender Saharan bhupi.saharan at gmail.com
Thu Jul 19 07:14:00 EST 2007


Hi Boris,

When you run the memory test, make sure both the data cache and the
instruction cache are enabled.

Also check your BAT settings: the DDR region must be cacheable there as
well, i.e. the cache-inhibit (I) bit must not be set.
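
To make the BAT check concrete, here is a rough sketch of how the WIMG bits
of a lower BAT register value could be decoded. It assumes the classic 32-bit
PowerPC BATL layout (W = 0x40, I = 0x20, M = 0x10, G = 0x08, PP in the low
two bits). The helper names are made up for illustration, and the values
passed in would have to come from your own board setup or from reading the
BAT registers in supervisor mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* WIMG bits in a lower BAT register (assumed classic 32-bit PowerPC layout). */
#define BATL_W 0x00000040u /* write-through */
#define BATL_I 0x00000020u /* caching inhibited */
#define BATL_M 0x00000010u /* memory coherence required */
#define BATL_G 0x00000008u /* guarded */

/* Hypothetical helper: true if the block mapped by this BATL value is cached. */
static bool batl_is_cached(uint32_t batl)
{
    return (batl & BATL_I) == 0;
}

/* Hypothetical helper: true if the block is cached and not write-through. */
static bool batl_is_writeback(uint32_t batl)
{
    return batl_is_cached(batl) && (batl & BATL_W) == 0;
}
```

For example, a value of 0x00000012 (M set, PP = read/write) would count as
cached and write-back, while 0x0000002A (I and G set) would not be cached.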

Burst transactions happen only when the cache is enabled, so a memory test
run with the caches off will not exercise them.
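
Something along these lines is what I have in mind for the test itself. It
is only a sketch: the 32-byte line size is an assumption for the e300 core,
the function names are invented, and the code only produces bursts if it runs
on the target over a cacheable region that is larger than the data cache.
With the cache off, the same loops generate single-beat accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 32u /* assumed cache line size */
#define LINE_WORDS (LINE_BYTES / sizeof(uint32_t))

/* Value expected at word index i; depends on i so address aliasing shows up. */
static uint32_t pattern(size_t i, uint32_t seed)
{
    return seed ^ (uint32_t)(i * 0x9E3779B9u);
}

/*
 * Write `words` 32-bit words one cache line at a time, then read them all
 * back. Returns the number of mismatching words; the index of the first
 * mismatch is stored in *first_bad when first_bad is not NULL.
 */
static size_t line_test(volatile uint32_t *mem, size_t words, uint32_t seed,
                        size_t *first_bad)
{
    size_t errors = 0;

    assert(words % LINE_WORDS == 0);

    /* Write pass: fill whole lines so a write-back cache evicts full lines. */
    for (size_t line = 0; line < words; line += LINE_WORDS)
        for (size_t w = 0; w < LINE_WORDS; w++)
            mem[line + w] = pattern(line + w, seed);

    /* Read pass: starts only after the whole region has been written. */
    for (size_t i = 0; i < words; i++) {
        if (mem[i] != pattern(i, seed)) {
            if (errors == 0 && first_bad)
                *first_bad = i;
            errors++;
        }
    }
    return errors;
}
```

Running it several times with different seeds, and over a region well above
the cache size, gives the lines a chance to be evicted and refilled between
the write and read passes.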

How about ECC?

Bhupi



On 7/17/07, Boris Shteinbock <borish at batm.co.il> wrote:
>
> Hi Everyone,
> I am working on a Linux port for a custom-built MPC8347 revision 3 board
> with DDR2 memory.
>
> I've successfully ported U-boot (latest git) and the kernel itself,
> however during kernel boot I am encountering serious memory corruption
> errors. The log for one of the examples is at the bottom of this
> message.
>
> Basically, the corruption always happens during memory-management-
> intensive tasks such as networking, JFFS2 mounting, etc.
>
> As far as I can see, it is not related to a specific driver, because
> it happens even with the kernel configured at the absolute minimum
> (serial console driver only, and even without it).
> The place of the corruption depends on the kernel configuration.
>
> The DDR2 memory controller is configured correctly as far as I can tell,
> since :
> 1. DDR2 controller register values are taken from VxWorks bootrom
> that works on this board without any problems.
> 2. u-boot mtest passes successfully
> 3. u-boot alternative mtest passes successfully
> 4. My own custom mem tests in u-boot pass successfully
> 5. If I manage to boot the board to a shell prompt (with the absolute
> minimum configuration), the memtester application also passes.
>
> The minimum configuration that I am able to boot to a shell is a
> kernel configured with a serial console and a small busybox JFFS2 file
> system in flash. In this configuration, the boot fails the first
> time the JFFS2 root FS is mounted. However, it does boot after a reset.
>
> I've tried different kernels, starting from, I think, 2.6.16 up to
> 2.6.22, with the same results.
> I also tried the kernel that Freescale provides for the 834x reference
> boards (with my board support added, of course).
>
> I tried booting both OF flat trees (powerpc) and bd_t based builds (ppc)
>
> I've also tried all the memory allocator options:
> SLAB, SLOB and SLUB (in the latest kernel). They all failed at some
> point, so the assumption is that the problem is not in the
> memory management facilities.
>
> The board manufacturer swears that DDR2 memory controller values are
> correct and should work perfectly.
>
> So now I am almost out of options and I am seeking your help.
> Any input on this issue would be greatly appreciated.
>
> Thanks,
> Boris
>
> PS. Note that the example below shows a failure during DHCP
> autoconfiguration. However, a similar error happens even when
> networking is disabled completely, just in a different place.
>
> => bootm
> ## Booting image at 00400000 ...
>    Image Name:   Linux-.6.21.5
>    Created:      2007-07-10  14:20:19 UTC
>    Image Type:   PowerPC Linux Kernel Image (gzip compressed)
>    Data Size:    898361 Bytes = 877.3 kB
>    Load Address: 00000000
>    Entry Point:  00000000
>    Verifying Checksum ... OK
>    Uncompressing Kernel Image ... OK
> ## Current stack ends at 0x07FA3CF8 => set upper limit to 0x00800000
> ## cmdline at 0x007FFF00 ... 0x007FFF41
> bd address  = 0x07FA3FBC
> memstart    = 0x00000000
> memsize     = 0x08000000
> flashstart  = 0xFE000000
> flashsize   = 0x02000000
> flashoffset = 0x00033000
> sramstart   = 0x00000000
> sramsize    = 0x00000000
> bootflags   = 0x00000001
> intfreq     =    528 MHz
> busfreq     =    264 MHz
> ethaddr     = 00:04:9F:EF:23:35
> eth1addr    = 00:E0:0C:00:7E:25
> IP addr     = 10.2.222.20
> baudrate    = 115200 bps
> No initrd
> ## Transferring control to Linux (at address 00000000) ...
> !!!! of_flat_tree = 00000000
> Booting without OF Flat tree
> Linux version .6.21.5 (me at localhost) (gcc version
> 4.0.0 (DENX ELDK 4.1 4.0.0)) #24 Tue Jul 10 17:20:09 IDT 2007
> Zone PFN ranges:
>   DMA             0 ->    32768
>   Normal      32768 ->    32768
> early_node_map[1] active PFN ranges
>     0:        0 ->    32768
> Built 1 zonelists.  Total pages: 32512
> Kernel command line: console=ttyS0,115200 root=/dev/mtdblock1
> rootfstype=jffs2 ip=dhcp
> IPIC (128 IRQ sources, 8 External IRQs) at fe000700
> PID hash table entries: 512 (order: 9, 2048 bytes)
> Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
> Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
> Memory: 127744k available (1584k kernel code, 444k data, 84k init, 0k
> highmem)
> Mount-cache hash table entries: 512
> NET: Registered protocol family 16
> Setup MTD partitions
> Generic PHY: Registered new driver
> NET: Registered protocol family 2
> IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
> TCP established hash table entries: 4096 (order: 3, 32768 bytes)
> TCP bind hash table entries: 4096 (order: 2, 16384 bytes)
> TCP: Hash tables configured (established 4096 bind 4096)
> TCP reno registered
> JFFS2 version 2.2. (NAND) (C) 2001-2006 Red Hat, Inc.
> io scheduler noop registered
> io scheduler anticipatory registered (default)
> io scheduler deadline registered
> io scheduler cfq registered
> Serial: 8250/16550 driver $Revision: 1.90 $ 2 ports, IRQ sharing
> disabled
> serial8250.0: ttyS0 at MMIO 0xe0004500 (irq = 9) is a 16550A
> serial8250.0: ttyS1 at MMIO 0xe0004600 (irq = 10) is a 16550A
> Gianfar MII Bus: probed
> eth0: Gianfar Ethernet Controller Version 1.2, 00:04:9f:ef:23:35
> eth0: Running with NAPI disabled
> eth0: 64/64 RX/TX BD ring size
> Broadcom BCM5241: Registered new driver
> physmap platform flash device: 02000000 at fe000000
> physmap-flash.0: Found 1 x16 devices at 0x0 in 8-bit bank
>  Amd/Fujitsu Extended Query Table at 0x0040
> physmap-flash.0: CFI does not contain boot bank location. Assuming top.
> number of CFI chips: 1
> cfi_cmdset_0002: Disabling erase-suspend-program due to code brokenness.
> cmdlinepart partition parsing not available
> RedBoot partition parsing not available
> Using physmap partition information
> Creating 2 MTD partitions on "physmap-flash.0":
> 0x00000000-0x00100000 : "uboot"
> 0x00100000-0x02000000 : "rootfs"
> IPv4 over IPv4 tunneling driver
> GRE over IPv4 tunneling driver
> TCP cubic registered
> NET: Registered protocol family 1
> NET: Registered protocol family 17
> !!!! Gianfar init_phy. phy_id = 0:01
> !!!! phy_attach. phy_id = 0:01
> !!!! phy_attach device found. phy_id = 0:01
> Sending DHCP requests .<3>slab: Internal list corruption detected in
> cache 'files_cache'(21), slabp c0344000(16). Hexdump:
>
> 000: 00 10 01 00 00 20 02 00 00 00 00 70 c0 34 40 70
> 010: 00 00 00 10 00 00 ff 10 00 00 00 00 ff ff ff fe
> 020: ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff fe
> 030: ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff fe
> 040: ff ff ff fe ff ff ff fe ff ff ff fe ff ff ff fe
> 050: ff ff ff fe ff ff ff fe ff ff ff fd 00 00 00 11
> 060: 00 00 00 12 00 00 00 13 00 00 00 14 ff ff ff ff
> ------------[ cut here ]------------
> kernel BUG at mm/slab.c:2936!
> Oops: Exception in kernel mode, sig: 5 [#1]
> NIP: C0050F80 LR: C0050F80 CTR: 00000000
> REGS: c034fe00 TRAP: 0700   Not tainted  (.6.21.5)
> MSR: 00021032 <ME,IR,DR>  CR: 24004002  XER: 00000000
> TASK = c032a3c0[3] 'events/0' THREAD: c034e000
> GPR00: C0050F80 C034FEB0 C032A3C0 00000001 00000DF5 FFFFFFFF C00F5B54
> 00000010
> GPR08: C01D0000 C01E0000 00000DF5 00000DF5 00000000 00F1C5DB 07FFD000
> FFFFFFFF
> GPR16: 00000001 00000000 00000000 00800000 00000000 007FFF00 00000000
> C033DAA0
> GPR24: C0339C90 00000007 00000000 C01B0000 C01B0000 C0344000 C033DAA0
> 00000070
> NIP [C0050F80] check_slabp+0xe4/0x11c
> LR [C0050F80] check_slabp+0xe4/0x11c
> Call Trace:
> [C034FEB0] [C0050F80] check_slabp+0xe4/0x11c (unreliable)
> [C034FED0] [C0051450] free_block+0x88/0x138
> [C034FF00] [C0052188] drain_array+0xa0/0xe0
> [C034FF20] [C0052228] cache_reap+0x60/0x144
> [C034FF40] [C00257B0] run_workqueue+0xd0/0x170
> [C034FF60] [C0025960] worker_thread+0x110/0x144
> [C034FFC0] [C00298E4] kthread+0x74/0xb0
> [C034FFF0] [C0005F38] kernel_thread+0x44/0x60
> Instruction dump:
> 387b6538 7c9df8ae 4bfc2b49 3bff0001 813e0020 5529103a 3929001c 7f9f4840
> 419cffcc 3c60c01b 3863336c 4bfc2b25 <0fe00000> 48000000 80030020
> 2f800000
>