Linux Boot failure : WARNING at arch/powerpc/kernel/eeh.c:357

Sachin Sant sachinp at linux.vnet.ibm.com
Wed Jul 10 18:24:23 AEST 2019


Recent linuxppc (merge branch) builds fail to boot on a POWER9 BMC box.
The last successful boot was with commit 3661376f60.

The following warning is printed continuously on the console:

[    4.101480] Console: switching to colour dummy device 80x25
[    4.101550] tg3 0005:01:00.1: enabling device (0140 -> 0142)
[    4.102249] nouveau 0004:04:00.0: enabling device (0140 -> 0142)
[    4.102305] nouveau 0004:04:00.0: NVIDIA GV100 (140000a1)
[    4.106342] WARNING: CPU: 0 PID: 2487 at arch/powerpc/kernel/eeh.c:357 eeh_check_failure+0x68/0xf0
[    4.106364] Modules linked in: nouveau(+) ast(+) i2c_algo_bit drm_kms_helper ahci syscopyarea sysfillrect libahci sysimgblt fb_sys_fops ttm libata drm mlx5_core(+) drm_panel_orientation_quirks tg3(+) ptp nvme pps_core nvme_core
[    4.106426] CPU: 0 PID: 2487 Comm: kworker/0:4 Not tainted 5.2.0-rc6+ #1
[    4.106454] Workqueue: events work_for_cpu_fn
[    4.106475] NIP:  c000000000043f08 LR: c000000000043eec CTR: c0000000005c2430
[    4.106540] REGS: c000003fd78b72b0 TRAP: 0700   Not tainted  (5.2.0-rc6+)
[    4.106577] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 84444028  XER: 20040000
[    4.106615] CFAR: c000000000074738 IRQMASK: 0 
[    4.106615] GPR00: 00000000ae012000 c000003fd78b7540 c00000000144a200 c000203fff654348 
[    4.106615] GPR04: c00a00008d301e00 0000000000000000 c000003fd78b7564 0000000000000015 
[    4.106615] GPR08: c000203fff654348 0000000000000001 0000000000000001 4000000000000000 
[    4.106615] GPR12: 0000000000000005 c000000001890000 c0000000001510b8 c000003fefca4ec0 
[    4.106615] GPR16: 0000000400040000 0000000000000000 0000000000000001 0000000000000001 
[    4.106615] GPR20: c000003febbd50b0 0000000000000002 0000000000000000 0000000040000000 
[    4.106615] GPR24: 0000000040000000 0000000040000000 000000000000d600 c000003fc7c77000 
[    4.106615] GPR28: c000003fe6b32a80 000000000000e600 c000003fa9990000 c00a00008d301e00 
[    4.106870] NIP [c000000000043f08] eeh_check_failure+0x68/0xf0
[    4.106898] LR [c000000000043eec] eeh_check_failure+0x4c/0xf0
[    4.106933] Call Trace:
[    4.106942] [c000003fd78b7540] [0000000000000001] 0x1 (unreliable)
[    4.106957] [c000003fd78b7580] [c0000000005c2530] ioread32+0x100/0x170
[    4.107033] [c000003fd78b75b0] [c00800000f0b7008] prom_read+0x78/0xf0 [nouveau]
[    4.107112] [c000003fd78b7600] [c00800000f0b5bb0] shadow_fetch.isra.1+0x90/0x120 [nouveau]
[    4.107191] [c000003fd78b7650] [c00800000f0b5cf8] shadow_image+0xb8/0x360 [nouveau]
[    4.107276] [c000003fd78b7700] [c00800000f0b6040] shadow_method+0xa0/0x1a0 [nouveau]
[    4.107353] [c000003fd78b7780] [c00800000f0b6338] nvbios_shadow+0x1f8/0x3f0 [nouveau]
[    4.107434] [c000003fd78b7910] [c00800000f0a23e8] nvkm_bios_new+0x98/0x530 [nouveau]
[    4.107505] [c000003fd78b79b0] [c00800000f134810] nvkm_device_ctor+0x1610/0x4120 [nouveau]
[    4.107577] [c000003fd78b7aa0] [c00800000f137cb0] nvkm_device_pci_new+0x180/0x3d0 [nouveau]
[    4.107659] [c000003fd78b7b80] [c00800000f1a13a0] nouveau_drm_probe+0x220/0x360 [nouveau]
[    4.107697] [c000003fd78b7bd0] [c0000000006112ec] local_pci_probe+0x6c/0x100
[    4.107732] [c000003fd78b7c50] [c0000000001430f8] work_for_cpu_fn+0x38/0x60
[    4.107776] [c000003fd78b7c80] [c000000000148118] process_one_work+0x1c8/0x4a0
[    4.107821] [c000003fd78b7d20] [c000000000148668] worker_thread+0x278/0x570
[    4.107866] [c000003fd78b7db0] [c000000000151210] kthread+0x160/0x1a0
[    4.107897] [c000003fd78b7e20] [c00000000000ba54] ret_from_kernel_thread+0x5c/0x68
[    4.107931] Instruction dump:
[    4.107962] e8690040 e92d1178 f9210028 39200000 48030631 60000000 2c230000 4182008c 
[    4.108001] 81410024 7d4a0034 554ad97e 694a0001 <0b0a0000> 7d201c28 7bff0420 79295d24 
[    4.108042] ---[ end trace 2a6f56e64fb66de9 ]---

The WARN message is printed by the following code snippet:

static inline unsigned long eeh_token_to_phys(unsigned long token)
{
        pte_t *ptep;
        unsigned long pa;
        int hugepage_shift;

        /*
         * We won't find hugepages here(this is iomem). Hence we are not
         * worried about _PAGE_SPLITTING/collapse. Also we will not hit
         * page table free, because of init_mm.
         */
        ptep = find_init_mm_pte(token, &hugepage_shift);
        if (!ptep)
                return token;
        WARN_ON(hugepage_shift);  <======
        pa = pte_pfn(*ptep) << PAGE_SHIFT;

        return pa | (token & (PAGE_SIZE-1));
}
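
To illustrate why a non-zero hugepage_shift matters here, below is a minimal
user-space sketch of the address arithmetic. It is not kernel code: the
SKETCH_* constants (64K base pages), the function names, and the sample
values are assumptions made for illustration only. The first function mirrors
the computation above, which keeps only the offset within a base page. The
second is a hypothetical variant that keeps the offset within the whole
mapping when the shift is non-zero.

```c
#include <stdint.h>

/* Assumption for illustration: 64K base pages. */
#define SKETCH_PAGE_SHIFT 16
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/*
 * Mirrors the arithmetic in eeh_token_to_phys(): physical page address
 * from the PFN, plus the token's offset within one base page.
 */
static uint64_t token_to_phys_base(uint64_t pfn, uint64_t token)
{
        uint64_t pa = pfn << SKETCH_PAGE_SHIFT;

        return pa | (token & (SKETCH_PAGE_SIZE - 1));
}

/*
 * Hypothetical variant: if the mapping is larger than a base page
 * (shift != 0), keep the token's offset within the whole mapping.
 * Assumes pfn points at the start of the (naturally aligned) mapping.
 */
static uint64_t token_to_phys_shift(uint64_t pfn, uint64_t token, int shift)
{
        uint64_t pa = pfn << SKETCH_PAGE_SHIFT;

        if (!shift)
                shift = SKETCH_PAGE_SHIFT;

        return pa | (token & ((1ULL << shift) - 1));
}
```

With a made-up 2M mapping (shift 21) starting at PFN 0x1220 and a token whose
offset into that mapping is 0x141e04, the first function yields 0x12201e04
because it drops the offset bits above the base page size, while the second
yields 0x12341e04. For a base-page mapping (shift 0) both give the same
result.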

Boot log attached.

Thanks
-Sachin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: boot.log
Type: application/octet-stream
Size: 615724 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20190710/a8712d2a/attachment-0001.obj>

