Device Tree Corrupted after unflatten_device_tree()

Lixin Yao Lixin.Yao at HSTX.com
Thu Oct 22 04:43:55 EST 2009


When corrupted, certain blocks of 64 bytes are messed up.
This is a screen dump of a good unflattened device tree at the beginning:

NCCv2>md 0x3ffdd40
03ffdd40 : c3ffddd4 c025a8dc 00000000 00000000  .....%..........
03ffdd50 : c3ffdd80 c3ffdd84 00000000 00000000  ................
03ffdd60 : c3ffddd8 00000000 c3fffe94 c3ffddd8  ................
03ffdd70 : 00000000 00000001 00000000 00000000  ................
03ffdd80 : 2f000000 c07306b4 00000010 c072f34c  /....s.......r.L
03ffdd90 : c3ffdd94 c07306ba 00000010 c072f368  .....s.......r.h
03ffdda0 : c3ffdda4 c07306c5 00000004 c072f384  .....s.......r..
03ffddb0 : c3ffddb4 c07306d4 00000004 c072f394  .....s.......r..
03ffddc0 : c3ffddc4 c025ed98 00000001 c3ffddd4  .....%..........
03ffddd0 : 00000000 00000000 c3ffde50 c025a8dc  ...........P.%..
03ffdde0 : 00000000 00000000 c3ffde18 c3ffde20  ............... 
03ffddf0 : 00000000 c3ffdd40 c3ffde58 c3ffdf78  .......@...X...x
03ffde00 : c3ffde58 c3ffde58 00000000 00000001  ...X...X........
03ffde10 : 00000000 00000000 2f637075 73000000  ......../cpus...
03ffde20 : c07306c5 00000004 c072f3b0 c3ffde30  .s.......r.....0
03ffde30 : c07306d4 00000004 c072f3c0 c3ffde40  .s.......r.....@
NCCv2>md
03ffde40 : c025ed98 00000005 c3ffde50 00000000  .%.........P....
03ffde50 : 63707573 00000000 c3ffdf6c c072f3e4  cpus.......l.r..
03ffde60 : 00000000 00000000 c3ffde98 c3ffdeac  ................
03ffde70 : 00000000 c3ffddd8 00000000 00000000  ................
03ffde80 : 00000000 c3ffdf78 00000000 00000002  .......x........
03ffde90 : 00000000 00000000 2f637075 732f506f  ......../cpus/Po
03ffdea0 : 77657250 432c3836 36403000 c07306e0  werPC,866@0..s..
03ffdeb0 : 00000004 c072f3e4 c3ffdebc c07306ec  .....r.......s..
03ffdec0 : 00000004 c072f3f4 c3ffdecc c07306f0  .....r.......s..
03ffded0 : 00000004 c072f404 c3ffdedc c0730702  .....r.......s..
03ffdee0 : 00000004 c072f414 c3ffdeec c0730714  .....r.......s..
03ffdef0 : 00000004 c072f424 c3ffdefc c0730721  .....r.$.....s.!
03ffdf00 : 00000004 c072f434 c3ffdf0c c073072e  .....r.4.....s..
03ffdf10 : 00000004 c072f444 c3ffdf1c c0730741  .....r.D.....s.A
03ffdf20 : 00000004 c072f454 c3ffdf2c c073074f  .....r.T...,.s.O
03ffdf30 : 00000004 c072f464 c3ffdf3c c073075f  .....r.d...<.s._
NCCv2>md
03ffdf40 : 00000008 c072f474 c3ffdf4c c073076a  .....r.t...L.s.j
03ffdf50 : 00000004 c072f488 c3ffdf5c c025ed98  .....r.....\.%..
03ffdf60 : 0000000c c3ffdf6c 00000000 506f7765  .......l....Powe
03ffdf70 : 7250432c 38363600 c3ffe01c c072f4d8  rPC,866......r..
03ffdf80 : 00000000 00000000 c3ffdfb8 c3ffdfcc  ................
03ffdf90 : 00000000 c3ffdd40 00000000 c3ffe030  .......@.......0
03ffdfa0 : 00000000 c3ffe030 00000000 00000001  .......0........
03ffdfb0 : 00000000 00000000 2f65636c 69707365  ......../eclipse
03ffdfc0 : 5f737065 63696669 6300bbe0 c07306c5  _specific....s..
03ffdfd0 : 00000004 c072f4b8 c3ffdfdc c07306d4  .....r.......s..
03ffdfe0 : 00000004 c072f4c8 c3ffdfec c07306e0  .....r.......s..
03ffdff0 : 00000015 c072f4d8 c3ffdffc c073077b  .....r.......s.{
03ffe000 : 00000004 c072f4fc c3ffe00c c025ed98  .....r.......%..
03ffe010 : 00000011 c3ffe01c 00000000 65636c69  ............ecli
03ffe020 : 7073655f 73706563 69666963 00ef3980  pse_specific..9.
03ffe030 : c3ffe0e4 c025a8dc 00000000 00000000  .....%..........

When corrupted, it becomes the following. Note that the 64-byte block at
0x03ffdf00 is messed up. This kind of corruption occurs several times
in the unflattened device tree. The affected blocks are properties of nodes.

NCCv2>md 0x3ffdd40
03ffdd40 : c3ffddd4 c025a8dc 00000000 00000000  .....%..........
03ffdd50 : c3ffdd80 c3ffdd84 00000000 00000000  ................
03ffdd60 : c3ffddd8 00000000 c3fffe94 c3ffddd8  ................
03ffdd70 : 00000000 00000001 00000000 00000000  ................
03ffdd80 : 2f000000 c07306b4 00000010 c072f34c  /....s.......r.L
03ffdd90 : c3ffdd94 c07306ba 00000010 c072f368  .....s.......r.h
03ffdda0 : c3ffdda4 c07306c5 00000004 c072f384  .....s.......r..
03ffddb0 : c3ffddb4 c07306d4 00000004 c072f394  .....s.......r..
03ffddc0 : c3ffddc4 c025ed98 00000001 c3ffddd4  .....%..........
03ffddd0 : 00000000 00000000 c3ffde50 c025a8dc  ...........P.%..
03ffdde0 : 00000000 00000000 c3ffde18 c3ffde20  ............... 
03ffddf0 : 00000000 c3ffdd40 c3ffde58 c3ffdf78  .......@...X...x
03ffde00 : c3ffde58 c3ffde58 00000000 00000001  ...X...X........
03ffde10 : 00000000 00000000 2f637075 73000000  ......../cpus...
03ffde20 : c07306c5 00000004 c072f3b0 c3ffde30  .s.......r.....0
03ffde30 : c07306d4 00000004 c072f3c0 c3ffde40  .s.......r.....@
NCCv2>md
03ffde40 : c025ed98 00000005 c3ffde50 00000000  .%.........P....
03ffde50 : 63707573 00000000 c3ffdf6c c072f3e4  cpus.......l.r..
03ffde60 : 00000000 00000000 c3ffde98 c3ffdeac  ................
03ffde70 : 00000000 c3ffddd8 00000000 00000000  ................
03ffde80 : 00000000 c3ffdf78 00000000 00000001  .......x........
03ffde90 : 00000000 00000000 2f637075 732f506f  ......../cpus/Po
03ffdea0 : 77657250 432c3836 36403000 c07306e0  werPC,866@0..s..
03ffdeb0 : 00000004 c072f3e4 c3ffdebc c07306ec  .....r.......s..
03ffdec0 : 00000004 c072f3f4 c3ffdecc c07306f0  .....r.......s..
03ffded0 : 00000004 c072f404 c3ffdedc c0730702  .....r.......s..
03ffdee0 : 00000004 c072f414 c3ffdeec c0730714  .....r.......s..
03ffdef0 : 00000004 c072f424 c3ffdefc c0730721  .....r.$.....s.!
03ffdf00 : ffffffff ffff000c db055be0 08060001  ..........[.....
03ffdf10 : 08000604 0001000c db055be0 ac141001  ..........[.....
03ffdf20 : 00000000 0000ac14 10530000 10530000  .........S...S..
03ffdf30 : 08000604 0001000c 36681bfe f874c01e  ........6h...t..
NCCv2>md
03ffdf40 : 00000008 c072f474 c3ffdf4c c073076a  .....r.t...L.s.j
03ffdf50 : 00000004 c072f488 c3ffdf5c c025ed98  .....r.....\.%..
03ffdf60 : 0000000c c3ffdf6c 00000000 506f7765  .......l....Powe
03ffdf70 : 7250432c 38363600 c3ffe01c c072f4d8  rPC,866......r..
03ffdf80 : 00000000 00000000 c3ffdfb8 c3ffdfcc  ................
03ffdf90 : 00000000 c3ffdd40 00000000 c3ffe030  .......@.......0
03ffdfa0 : 00000000 c3ffe030 00000000 00000001  .......0........
03ffdfb0 : 00000000 00000000 2f65636c 69707365  ......../eclipse
03ffdfc0 : 5f737065 63696669 63002870 c07306c5  _specific.(p.s..
03ffdfd0 : 00000004 c072f4b8 c3ffdfdc c07306d4  .....r.......s..
03ffdfe0 : 00000004 c072f4c8 c3ffdfec c07306e0  .....r.......s..
03ffdff0 : 00000015 c072f4d8 c3ffdffc c073077b  .....r.......s.{
03ffe000 : 00000004 c072f4fc c3ffe00c c025ed98  .....r.......%..
03ffe010 : 00000011 c3ffe01c 00000000 65636c69  ............ecli
03ffe020 : 7073655f 73706563 69666963 007be4fa  pse_specific.{..
03ffe030 : c3ffe0e4 c025a8dc 00000000 00000000  .....%..........
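
Comparing the two dumps by eye is error-prone, so here is a minimal sketch of how the clobbered 64-byte blocks could be located automatically. It assumes both dumps have already been captured into byte buffers of equal length. The function name find_clobbered_blocks, the BLOCK_SIZE constant and the min_diff threshold are illustrative and not part of the kernel or U-Boot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 64u  /* block granularity observed in the dumps */

/* Count how many bytes differ between two equally sized regions. */
static size_t count_diff_bytes(const unsigned char *a,
                               const unsigned char *b, size_t n)
{
    size_t diff = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            diff++;
    return diff;
}

/*
 * Walk two snapshots of the same memory region in BLOCK_SIZE steps and
 * record the offset of every block in which at least min_diff bytes
 * differ. Up to max_offsets offsets are stored; the return value is the
 * total number of matching blocks, which may exceed max_offsets.
 */
size_t find_clobbered_blocks(const unsigned char *good,
                             const unsigned char *bad,
                             size_t len, size_t min_diff,
                             size_t *offsets, size_t max_offsets)
{
    size_t found = 0;

    assert(good != NULL && bad != NULL);

    for (size_t off = 0; off < len; off += BLOCK_SIZE) {
        /* The last block may be shorter than BLOCK_SIZE. */
        size_t n = (len - off < BLOCK_SIZE) ? len - off : BLOCK_SIZE;

        if (memcmp(good + off, bad + off, n) == 0)
            continue;  /* block is identical */
        if (count_diff_bytes(good + off, bad + off, n) < min_diff)
            continue;  /* only a few bytes changed, ignore */
        if (found < max_offsets)
            offsets[found] = off;
        found++;
    }
    return found;
}
```

A threshold such as min_diff = 8 skips blocks where only a few bytes changed and reports only blocks that were largely overwritten, like the one at 0x03ffdf00 above.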

Also, I found that
mem = lmb_alloc(size + 4, __alignof__(struct device_node));
in unflatten_device_tree() does not actually return memory aligned on a
64-byte boundary, although sizeof(struct device_node) is 64 bytes. This
bothers me. I assume this code has been exercised many times on many boards,
so it should work. I am not sure whether this causes the problem.
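
For reference, here is a small sketch of why the alignment and the size can differ. GCC's __alignof__ reports the minimum alignment the compiler requires for a type, which follows from its most strictly aligned member rather than from its total size. The struct below is a hypothetical stand-in (it is not the kernel's struct device_node), and is_aligned and align_up are illustrative helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in, NOT the kernel's struct device_node.
 * On a 32-bit target it is 64 bytes long, yet every member only
 * needs 4-byte alignment, so __alignof__ yields 4 there.
 */
struct fake_node {
    const char *name;
    const char *type;
    uint32_t    phandle;
    void       *links[13];
};

static int is_pow2(size_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Return nonzero if addr is a multiple of align (a power of two). */
int is_aligned(uintptr_t addr, size_t align)
{
    assert(is_pow2(align));
    return (addr & (uintptr_t)(align - 1)) == 0;
}

/* Round addr up to the next multiple of align (a power of two). */
uintptr_t align_up(uintptr_t addr, size_t align)
{
    assert(is_pow2(align));
    return (addr + align - 1) & ~(uintptr_t)(align - 1);
}

/* Alignment the compiler requires for the type (GCC extension). */
size_t fake_node_alignment(void)
{
    return __alignof__(struct fake_node);
}

/* Total size of the type, padding included. */
size_t fake_node_size(void)
{
    return sizeof(struct fake_node);
}
```

With these helpers, an address such as 0x03ffddd8, which appears as a pointer value in the dump, is 4-byte aligned but not 64-byte aligned. That is what one would expect if the allocation is only aligned to __alignof__ of the structure and not to its size.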

Thanks!

Lixin

-----Original Message-----
From: Michael Ellerman [mailto:michael at ellerman.id.au] 
Sent: Tuesday, October 20, 2009 7:02 PM
To: Lixin Yao
Cc: linuxppc-dev at lists.ozlabs.org
Subject: Re: Device Tree Corrupted after unflatten_device_tree()

On Tue, 2009-10-20 at 09:10 -0700, Lixin Yao wrote:
> I use a board with an MPC866T and the 2.6.28 Linux kernel. Occasionally, the
> unflattened device tree is corrupted after “unflatten_device_tree()”, which
> causes the kernel to crash when the device tree is traversed later on.
> 
> I looked at the fixes in lib/lmb.c, arch/powerpc/mm,
> arch/powerpc/kernel etc. from 2.6.28 to 2.6.32-rc4 (the most recent
> version) and could not fix my problem.
> 
> I have had a hard time trying to determine the cause. 
> 
> arch/powerpc/kernel/setup_32.c
> 
> void __init setup_arch(char **cmdline_p)
> {
>         *cmdline_p = cmd_line;
> 
>         /* so udelay does something sensible, assume <= 1000 bogomips */
>         loops_per_jiffy = 500000000 / HZ;
> 
>         unflatten_device_tree();
> 
>         /* UNFLATTENED DEVICE TREE IS CORRUPTED SOMETIMES HERE */

_In what way_ is it corrupted? Bad tree structure? Bogus node/property
values, names etc.

cheers

