KASAN debug kernel fails to boot at early stage when CONFIG_SMP=y is set (kernel 6.5-rc5, PowerMac G4 3,6)

Michael Ellerman mpe at ellerman.id.au
Thu Aug 24 21:36:26 AEST 2023


Erhard Furtner <erhard_f at mailbox.org> writes:
> On Tue, 22 Aug 2023 07:31:54 +0000
> Christophe Leroy <christophe.leroy at csgroup.eu> wrote:
>
>> Le 18/08/2023 à 18:23, Erhard Furtner a écrit :
>> > On Fri, 18 Aug 2023 15:47:38 +0000
>> > Christophe Leroy <christophe.leroy at csgroup.eu> wrote:
>> >   
>> >> I'm wondering if the problem is just linked to the kernel being built
>> >> with CONFIG_SMP or if it is the actual startup of a secondary CPU that
>> >> causes the freeze.
>> >>
>> >> Please leave the btext_unmap() in place because I think it is important
>> >> to keep it, and start the kernel with the following parameter:
>> >>
>> >> nr_cpus=1  
>> > 
>> > With btext_unmap() back in place and nr_cpus=1 set, the freeze still happens after the first btext_unmap:129 on cold boots:
>> > 
>> > [    0.000000] printk: bootconsole [udbg0] enabled
>> > [    0.000000] Total memory = 2048MB; using 4096kB for hash table
>> > [    0.000000] mapin_ram:125
>> > [    0.000000] mmu_mapin_ram:169 0 30000000 1400000 2000000
>> > [    0.000000] __mmu_mapin_ram:146 0 1400000
>> > [    0.000000] __mmu_mapin_ram:155 1400000
>> > [    0.000000] __mmu_mapin_ram:146 1400000 30000000
>> > [    0.000000] __mmu_mapin_ram:155 20000000
>> > [    0.000000] __mapin_ram_chunk:107 20000000 30000000
>> > [    0.000000] __mapin_ram_chunk:117
>> > [    0.000000] mapin_ram:134
>> > [    0.000000] kasan_mmu_init:129
>> > [    0.000000] kasan_mmu_init:132 0
>> > [    0.000000] kasan_mmu_init:137
>> > [    0.000000] btext_unmap:129
>> >   
>> 
>> Thanks,
>> 
>> Can you replace the call to btext_unmap() with a call to btext_map() at
>> the end of MMU_init()?
>> 
>> If that gives no interesting result, can you leave the call to
>> btext_unmap() in place and add a call to btext_map() at the very beginning
>> of the function start_kernel() in init/main.c? (You may have to add an
>> include of asm/btext.h.)
>> 
>> With that I hope we can see more stuff.
>
> Ok, I tested out both methods.
>
>   1.) Replace btext_unmap() with btext_map() at the end of MMU_init().
>
> Warm boot again is unspectacular (attached). On cold boots I sometimes get:
>
> printk: bootconsole [udbg0] enabled
> Total memory = 2048MB; using 4096kB for hash table
> mapin_ram:125
> mmu_mapin_ram:169 0 30000000 1400000 2000000
> __mmu_mapin_ram:146 0 1400000
> __mmu_mapin_ram:155 1400000
> __mmu_mapin_ram:146 1400000 30000000
> __mmu_mapin_ram:155 20000000
> __mapin_ram_chunk:107 20000000 30000000
> __mapin_ram_chunk:117
> mapin_ram:134
> kasan_mmu_init:129
> kasan_mmu_init:132 0
> kasan_mmu_init:137
> ioremap() called early from btext_map+0x64/0xdc. Use early_ioremap() instead
> Linux version 6.5.0-rc7-PMacG4-dirty (root@T1000) (gcc (Gentoo 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #4 SMP Wed Aug 23 12:59:11 CEST 2023
>
> which shows one more line (Linux version...) than before. Most of the time, however, I get this more interesting output:
>
> kasan_mmu_init:129
> kasan_mmu_init:132 0
> kasan_mmu_init:137
> Linux version 6.5.0-rc7-PMacG4-dirty (root@T1000) (gcc (Gentoo 12.3.1_p20230526 p2) 12.3.1 20230526, GNU ld (Gentoo 2.40 p7) 2.40.0) #4 SMP Wed Aug 23 12:59:11 CEST 2023
> KASAN init done
> list_add corruption. prev->next should be next (c17100c0), but was 2c030000. (prev=c036ac7c).
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:30!
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 0 at arch/powerpc/include/asm/machdep.h:227 die+0xd8/0x39c

This is a WARN hit while handling the original bug.

Can you apply this patch to avoid that happening, so we can see the
original bug better?

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index eeff136b83d9..341a0635e131 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -198,8 +198,6 @@ static unsigned long oops_begin(struct pt_regs *regs)
 	die_owner = cpu;
 	console_verbose();
 	bust_spinlocks(1);
-	if (machine_is(powermac))
-		pmac_backlight_unblank();
 	return flags;
 }
 NOKPROBE_SYMBOL(oops_begin);
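
For readers unfamiliar with the "list_add corruption" report quoted above,
here is a rough userspace sketch of the kind of sanity check the list
debugging code performs before linking a new entry. It is only modelled on
the checks in lib/list_debug.c, not copied from it: the helper names
init_list_head(), list_add_is_valid() and list_add_tail_checked() are made up
for this illustration, and tail insertion is used purely as an example. The
"prev->next should be next" message corresponds to the second check.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/*
 * Hypothetical helper: verify that prev and next really are neighbours
 * before a new entry is linked between them. Only pointer values are
 * compared, so a bogus prev->next is detected without dereferencing it.
 */
static bool list_add_is_valid(struct list_head *new,
			      struct list_head *prev,
			      struct list_head *next)
{
	if (next->prev != prev) {
		fprintf(stderr,
			"list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
			(void *)prev, (void *)next->prev, (void *)next);
		return false;
	}
	if (prev->next != next) {
		fprintf(stderr,
			"list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
			(void *)next, (void *)prev->next, (void *)prev);
		return false;
	}
	if (new == prev || new == next) {
		fprintf(stderr, "list_add double add: new=%p\n", (void *)new);
		return false;
	}
	return true;
}

/* Hypothetical helper: append new before head, refusing if the list looks corrupt. */
static bool list_add_tail_checked(struct list_head *new, struct list_head *head)
{
	struct list_head *prev = head->prev;
	struct list_head *next = head;

	if (!list_add_is_valid(new, prev, next))
		return false;

	next->prev = new;
	new->next = next;
	new->prev = prev;
	prev->next = new;
	return true;
}
```

In this sketch, overwriting the next pointer of the last entry and then
appending another entry makes list_add_tail_checked() print the
"prev->next should be next" message and return false, leaving the list
untouched. In the kernel the failed check ends in the BUG at
lib/list_debug.c:30 seen in the log; the sketch just returns false. Such a
report generally means something scribbled over the list node earlier; the
check only notices the damage, it does not identify the writer.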


cheers

