udev faulting with openbmc filesystem builds.

Joel Stanley joel at jms.id.au
Tue Feb 13 12:51:59 AEDT 2018


On Tue, Feb 13, 2018 at 4:44 AM, Steven J. Hill <Steven.Hill at cavium.com> wrote:
> On 02/12/2018 12:44 AM, Joel Stanley wrote:
>>
>> Do your other filesystems have similar userspace components, such as systemd?
>>
>> Have you tried to reproduce using the OpenBMC kernel?
>>
> Joel,
>
> Yes, I have tried the OpenBMC kernel and get the same results. Let me outline
> my platform, kernel versions tried, etc. for completeness.
>
> Developing on Aspeed AST2500 evaluation board. The core is an AST2500-A1.
> The detailed core information from the kernel is:
>
>    CPU: ARMv6-compatible processor [410fb767] revision 7 (ARMv7), cr=00c5307d
>    CPU: PIPT / VIPT nonaliasing data cache, VIPT nonaliasing instruction cache
>    OF: fdt: Machine model: AST2500 EVB
>
> Kernels that I have tested:
>
>    Palmetto (built by the OBMC build system)
>    Witherspoon (built by the OBMC build system)
>    Generic OpenBMC kernel from 'dev-4.10' branch

When you say "generic", which device tree are you using?

I can't tell what has gone wrong from your backtrace alone. Which
device driver was being loaded when you hit this issue? Booting with
initcall_debug on the kernel command line might help here.

I suggest you use the device tree for the ast2500-evb. Here's how I
would build a kernel for testing:

git checkout dev-4.13
make CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm aspeed_g5_defconfig
make CROSS_COMPILE=arm-linux-gnueabi- ARCH=arm

cat << EOF > evb.its
/dts-v1/;

/ {
        description = "test kernel";
        #address-cells = <1>;

        images {
                kernel@1 {
                        description = "Linux kernel";
                        data = /incbin/("arch/arm/boot/zImage");
                        type = "kernel";
                        arch = "arm";
                        os = "linux";
                        compression = "none";
                        load = <0x80001000>;
                        entry = <0x80001000>;
                };
                fdt@1 {
                        description = "device tree";
                        data = /incbin/("arch/arm/boot/dts/aspeed-ast2500-evb.dtb");
                        type = "flat_dt";
                        arch = "arm";
                        compression = "none";
                };
                ramdisk@1 {
                        description = "initramfs";
                        data = /incbin/("rootfs.cpio.xz");
                        type = "ramdisk";
                        arch = "arm";
                        os = "linux";
                };
        };

        configurations {
                default = "conf@1";
                conf@1 {
                        description = "Boot Linux kernel with FDT blob, ramdisk";
                        kernel = "kernel@1";
                        fdt = "fdt@1";
                        ramdisk = "ramdisk@1";
                };
        };
                };
        };
};
EOF

mkimage -f evb.its evb
cp evb /srv/tftp/

From u-boot:

setenv serverip <tftp server ip>
dhcp evb
bootm
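
To get initcall_debug onto the kernel command line, one option is to
append it to the u-boot bootargs variable before running bootm. This is
only a sketch: it assumes your u-boot environment already has a bootargs
variable holding the console settings. If bootargs is unset, setting it
here will likely replace the bootargs from the device tree, so add the
console arguments you need by hand.

```
setenv bootargs "${bootargs} initcall_debug"
printenv bootargs
```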

If that shows the bug, send me the full dmesg log with initcall_debug
enabled and we can go from there.
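
As a rough aid for reading such a log, the helper below prints the last
"calling" line that initcall_debug emitted before the first paging
request fault. The function name last_initcall and the log file name are
made up for illustration. It is only a hint at where to look: a fault
taken from a userspace process such as udevadm may have nothing to do
with the last initcall.

```shell
#!/bin/sh
# last_initcall LOGFILE (hypothetical helper)
# Print the last initcall_debug "calling  <fn> @ <pid>" line that appears
# before the first "Unable to handle kernel paging request" message.
last_initcall() {
    awk '
        # Stop scanning at the first fault; END still runs after exit.
        /Unable to handle kernel paging request/ { exit }
        # Remember the most recent initcall entry line.
        /calling +[^ ]+ @ [0-9]+/ { last = $0 }
        END { if (last != "") print last }
    ' "$1"
}
```

Usage, assuming the serial console output was saved to dmesg.log:

    last_initcall dmesg.log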

Cheers,

Joel

>    Aspeed SDK kernel based on 4.9 (supplied from Aspeed)
>    Aspeed SDK kernel based on 4.10 (ported by me)
>    Aspeed SDK kernel based on 4.11 (ported by me)
>    Aspeed SDK kernel based on 4.12 (ported by me)
>    Aspeed SDK kernel based on 4.13 (ported by me)
>    Aspeed SDK kernel based on 4.14 (ported by me)
>    Aspeed SDK kernel based on 4.15 (ported by me)
>
> Root filesystems tested, all using systemd:
>
>    Palmetto
>    Witherspoon
>    buildroot (HEAD)
>
> All kernel/RFS combinations fault at exactly the same function inside
> of 'udevadm' shown below. It is always the 'strlen' function. For all
> test runs the virtual address is always aligned on a 4KB page boundary.
> I have tried variations of SLUB, SLAB, enabling errata workarounds,
> passing different 'mem=xxx' to the kernel, kernel hacking options,
> "kernel mem{cpy,set}() for {copy_to,clear}_user()" option and others.
> This is where I am currently. Any insight would be great. Cheers.
>
> Steve
>
>
>
> Unable to handle kernel paging request at virtual address 9bc8c000
> pgd = db438000
> [9bc8c000] *pgd=00000000
> Internal error: Oops: 5 [#1] ARM
> Modules linked in:
> CPU: 0 PID: 900 Comm: udevadm Not tainted 4.10.0+ #3
> Hardware name: ASpeed BMC SoC
> task: dbed7880 task.stack: db4ae000
> PC is at strlen+0xc/0x38
> LR is at kstrdup+0x20/0x58
> pc : [<c01f610c>]    lr : [<c00965ec>]    psr: a0000013
> sp : db4afd30  ip : db4afd40  fp : db4afd3c
> r10: db5a88c0  r9 : dbda1f08  r8 : 00000fff
> r7 : db4afd9a  r6 : 9bc8c000  r5 : c0269d9c  r4 : 014000c0
> r3 : dbdbf990  r2 : 0000f000  r1 : 014000c0  r0 : 9bc8c000
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
> Control: 00c5387d  Table: 9b438008  DAC: 00000055
> Process udevadm (pid: 900, stack limit = 0xdb4ae188)
> Stack: (0xdb4afd30 to 0xdb4b0000)
> ...
> ...
> ...
> Backtrace:
> [<c01f6100>] (strlen) from [<c00965ec>] (kstrdup+0x20/0x58)
> [<c00965cc>] (kstrdup) from [<c0269d9c>] (misc_devnode+0x38/0x40)
>  r7:db4afd9a r6:dbda1f00 r5:db4afd9c r4:dbda1f08
> [<c0269d64>] (misc_devnode) from [<c026e7b0>] (device_get_devnode+0x78/0xdc)
> [<c026e738>] (device_get_devnode) from [<c026e950>] (dev_uevent+0x13c/0x1e0)

