linux-next: powerpc le qemu boot failure after merge of the akpm tree

Christophe Leroy christophe.leroy at c-s.fr
Thu Jan 31 17:15:26 AEDT 2019



On 31/01/2019 at 07:06, Stephen Rothwell wrote:
> Hi all,
> 
> On Thu, 31 Jan 2019 16:38:54 +1100 Stephen Rothwell <sfr at canb.auug.org.au> wrote:
>>
>> [I am guessing that it is something in Andrew's tree that has caused
>> this.]
>>
>> My qemu boot of the powerpc pseries_le_defconfig config failed like this:
>>
>> htab_hash_mask    = 0x1ffff
>> -----------------------------------------------------
>> numa:   NODE_DATA [mem 0x7ffe7000-0x7ffebfff]
>> Kernel panic - not syncing: sparse_buffer_init: Failed to allocate 2147483648 bytes align=0x10000 nid=0 from=fffffffffffffff
>> CPU: 0 PID: 0 Comm: swapper Not tainted 5.0.0-rc4 #2
>> Call Trace:
>> [c00000000105bbd0] [c000000000b1345c] dump_stack+0xb0/0xf4 (unreliable)
>> [c00000000105bc10] [c000000000111120] panic+0x168/0x3b8
>> [c00000000105bcb0] [c000000000e701c8] sparse_init_nid+0x178/0x550
>> [c00000000105bd70] [c000000000e709b4] sparse_init+0x210/0x238
>> [c00000000105bdb0] [c000000000e468f4] initmem_init+0x1e0/0x260
>> [c00000000105be80] [c000000000e3b9b0] setup_arch+0x354/0x3d4
>> [c00000000105bef0] [c000000000e33afc] start_kernel+0x98/0x648
>> [c00000000105bf90] [c00000000000b270] start_here_common+0x1c/0x52c
> 
> A quick bisect leads to this:
> 
> 1c3c9328cde027eb875ba4692f0a5d66b0afe862 is the first bad commit
> commit 1c3c9328cde027eb875ba4692f0a5d66b0afe862
> Author: Mike Rapoport <rppt at linux.ibm.com>
> Date:   Thu Jan 31 10:51:32 2019 +1100
> 
>      treewide: add checks for the return value of memblock_alloc*()
>      
>      Add check for the return value of memblock_alloc*() functions and call
>      panic() in case of error.  The panic message repeats the one used by
>      panicking memblock allocators with adjustment of parameters to include only
>      relevant ones.
>      
>      The replacement was mostly automated with semantic patches like the one
>      below with manual massaging of format strings.
>      
>      @@
>      expression ptr, size, align;
>      @@
>      ptr = memblock_alloc(size, align);
>      + if (!ptr)
>      +       panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__,
>      size, align);
>      
>      Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
>      Signed-off-by: Mike Rapoport <rppt at linux.ibm.com>
>      Reviewed-by: Guo Ren <ren_guo at c-sky.com>                [c-sky]
>      Acked-by: Paul Burton <paul.burton at mips.com>            [MIPS]
>      Acked-by: Heiko Carstens <heiko.carstens at de.ibm.com>    [s390]
>      Reviewed-by: Juergen Gross <jgross at suse.com>            [Xen]
>      Reviewed-by: Geert Uytterhoeven <geert at linux-m68k.org>  [m68k]
>      Cc: Catalin Marinas <catalin.marinas at arm.com>
>      Cc: Christophe Leroy <christophe.leroy at c-s.fr>
>      Cc: Christoph Hellwig <hch at lst.de>
>      Cc: "David S. Miller" <davem at davemloft.net>
>      Cc: Dennis Zhou <dennis at kernel.org>
>      Cc: Greentime Hu <green.hu at gmail.com>
>      Cc: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
>      Cc: Guan Xuetao <gxt at pku.edu.cn>
>      Cc: Guo Ren <guoren at kernel.org>
>      Cc: Mark Salter <msalter at redhat.com>
>      Cc: Matt Turner <mattst88 at gmail.com>
>      Cc: Max Filippov <jcmvbkbc at gmail.com>
>      Cc: Michael Ellerman <mpe at ellerman.id.au>
>      Cc: Michal Simek <monstr at monstr.eu>
>      Cc: Petr Mladek <pmladek at suse.com>
>      Cc: Richard Weinberger <richard at nod.at>
>      Cc: Rich Felker <dalias at libc.org>
>      Cc: Rob Herring <robh+dt at kernel.org>
>      Cc: Rob Herring <robh at kernel.org>
>      Cc: Russell King <linux at armlinux.org.uk>
>      Cc: Stafford Horne <shorne at gmail.com>
>      Cc: Tony Luck <tony.luck at intel.com>
>      Cc: Vineet Gupta <vgupta at synopsys.com>
>      Cc: Yoshinori Sato <ysato at users.sourceforge.jp>
>      Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
> 
> Which is just adding the panic we hit.  So, presumably, the bug is in a
> preceding patch :-(
> 
> I have left the kernel not booting for today.
> 

No, I think the error really is in that patch; see my other mail.

See
https://elixir.bootlin.com/linux/v5.0-rc4/source/mm/memblock.c#L1455:
memblock_alloc_try_nid_raw() is not supposed to panic, so the last hunk
of this patch should be reverted.

In total, I found three problematic hunks in that patch:

@@ -48,6 +53,11 @@ static phys_addr_t __init kasan_alloc_raw_page(int node)
  	void *p = memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
  						__pa(MAX_DMA_ADDRESS),
  						MEMBLOCK_ALLOC_KASAN, node);
+	if (!p)
+		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%llx\n",
+		      __func__, PAGE_SIZE, PAGE_SIZE, node,
+		      __pa(MAX_DMA_ADDRESS));
+
  	return __pa(p);
  }

@@ -211,6 +211,9 @@ static int __init iob_init(struct device_node *dn)
  	iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21,
  					MEMBLOCK_LOW_LIMIT, 0x80000000,
  					NUMA_NO_NODE);
+	if (!iob_l2_base)
+		panic("%s: Failed to allocate %lu bytes align=0x%lx max_addr=%x\n",
+		      __func__, 1UL << 21, 1UL << 21, 0x80000000);

  	pr_info("IOBMAP L2 allocated at: %p\n", iob_l2_base);


@@ -425,6 +436,10 @@ static void __init sparse_buffer_init(unsigned long size, int nid)
  		memblock_alloc_try_nid_raw(size, PAGE_SIZE,
  						__pa(MAX_DMA_ADDRESS),
  						MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+	if (!sparsemap_buf)
+		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%lx\n",
+		      __func__, size, PAGE_SIZE, nid, __pa(MAX_DMA_ADDRESS));
+
  	sparsemap_buf_end = sparsemap_buf + size;
  }
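
To illustrate why a panic right after a raw allocation can be wrong, here
is a minimal userspace sketch. It assumes the caller is written to tolerate
a NULL return and to fall back to smaller allocations later. Every name in
it (model_alloc_raw, model_buffer_init, model_buffer_alloc, MODEL_LIMIT) is
hypothetical and only models the pattern; it is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical limit: the largest request the model allocator can serve. */
#define MODEL_LIMIT (1UL << 20)

/* Stand-in for a "raw" allocator: returns NULL on failure, never panics. */
static void *model_alloc_raw(size_t size)
{
	if (size == 0 || size > MODEL_LIMIT)
		return NULL;
	return malloc(size);
}

struct model_sparse_buf {
	char *buf;	/* NULL means "no bulk buffer, use the fallback" */
	char *buf_end;
};

/* Optional bulk allocation: a failure here is tolerated, not fatal. */
static void model_buffer_init(struct model_sparse_buf *sb, size_t size)
{
	assert(sb);
	sb->buf = model_alloc_raw(size);
	sb->buf_end = sb->buf ? sb->buf + size : NULL;
}

/*
 * Carve a chunk out of the bulk buffer when one exists and has room,
 * otherwise fall back to a separate, smaller allocation.
 * *from_bulk reports which path was taken.
 */
static void *model_buffer_alloc(struct model_sparse_buf *sb, size_t size,
				int *from_bulk)
{
	assert(sb && from_bulk);
	if (sb->buf && (size_t)(sb->buf_end - sb->buf) >= size) {
		void *p = sb->buf;

		sb->buf += size;
		*from_bulk = 1;
		return p;
	}
	*from_bulk = 0;
	return model_alloc_raw(size);
}
```

In this model, a 2 GiB request to model_buffer_init() fails quietly, and a
later model_buffer_alloc() call still succeeds through the fallback path.
Adding a panic() after the bulk allocation would turn that recoverable case
into a boot failure, which matches the symptom in the report above.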



Christophe

