[Skiboot] [PATCH] external/mambo: Disable MEMORY_OVERFLOW
Michael Ellerman
mpe at ellerman.id.au
Thu Jul 2 17:03:22 AEST 2020
Gustavo Romero <gromero at linux.vnet.ibm.com> writes:
> On 6/28/20 10:27 PM, Michael Ellerman wrote:
>> Gustavo Romero <gromero at linux.vnet.ibm.com> writes:
>>> On 6/25/20 8:56 AM, Michael Ellerman wrote:
>>>> Mambo has a strange feature called MEMORY_OVERFLOW, enabled by
>>>> default, which causes some accesses to non-existent memory addresses
>>>> to transparently "create" memory.
>>>>
>>>> This can be confusing when debugging, eg:
>>>>
>>>> systemsim % mysim cpu 0 display spr pc
>>>> 0xC0000000000246B8
>>>> systemsim % mysim memory display 0xC0000000000246B8 8
>>>> 0x0000000000000000
>>>>
>>>> Appears to show that the memory at pc (NIP) is currently zeroes.
>>>>
>>>> The astute observer will note that "mysim memory display" takes
>>>> physical addresses, not effective addresses. So unless this machine
>>>> has > 12EB of RAM, this access should have failed as there is no
>>>> memory at that address.
>>>>
>>>> Turning MEMORY_OVERFLOW off gives us a much more sensible result:
>>>>
>>>> systemsim % mysim memory display 0xC0000000000246B8 8
>>>> Illegal Address 0xC0000000000246B8
>>>>
>>>> It doesn't appear to have any effect on accesses done from Linux, with
>>>> the setting enabled or disabled we still get a machine check for bad
>>>> accesses in real mode:
>>>
>>> With that change applied, on mambo P10 running on a POWER8 I'm getting
>>> the following mambo exception, which prevents the kernel from booting:
>>
>> This looks like exactly the kind of thing we want to catch, so that's
>> "good" :)
>>
>>> [...]
>>> 142233280: (536372251): [ 0.001554] printk: bootconsole [udbg0] disabled
>>> 142387801: (537126772): [ 0.001870] mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl
>>> 142412629: (537151600): [ 0.001919] pid_max: default: 32768 minimum: 301
>>> 142549799: (537888770): [ 0.002187] Mount-cache hash table entries: 16384 (order: 1, 131072 bytes, linear)
>>> 142570582: (537909553): [ 0.002228] Mountpoint-cache hash table entries: 16384 (order: 1, 131072 bytes, linear)
>>> 143544723: (541883694): [ 0.004130] EEH: PowerNV platform initialized
>>> 143557364: (541896335): [ 0.004155] POWER9 performance monitor hardware support registered
>>> 143629574: (541968545): [ 0.004296] rcu: Hierarchical SRCU implementation.
>>> 143904253: (543143224): [ 0.004833] smp: Bringing up secondary CPUs ...
>>> WARNING: 145271326: (548660031): Write_Mapped_Memory_Reg: Unknown address: 0x00000E995A3AF7B0, length=8
>>> FATAL ERROR: 145271326: (548660031): Attempt to store non-existent address 0x00000E995A3AF7B0
>>> INFO: 145271326: (548660032): ** Execution stopped: Mambo Error, **
>>> 145271326: ** finished running 548660032 instructions **
>>
>> Can you see where the bad store came from? "bt" should give you a backtrace.
>>
>> Then "p pc" should give you the PC and "di <that value>" should show us
>> what instruction it was.
>
> PC itself is parked, believe it or not, at a 'nop' instruction:

Hmm, maybe we're looking at the wrong CPU/thread?

> 151630615: (157638813): [ 0.040390] pstore: Registered nvram as persistent store backend
> 152359208: (158398639): [ 0.041813] PCI: Probing PCI hardware
> 152666145: (158718713): [ 0.042412] audit: type=2000 audit(1024007219.010:1): state=initialized audit_enabled=0 res=1
> 153055303: (159124575): [ 0.043172] cpuidle-powernv: Default stop: psscr = 0x0000000000000300,mask=0x00000000003003ff
> 153074775: (159144887): [ 0.043210] cpuidle-powernv: Deepest stop: psscr = 0x0000000000300322,mask=0x00000000003003ff
> 153093510: (159164435): [ 0.043247] cpuidle-powernv: First stop level that may lose SPRs = 0x10
> 153108589: (159180151): [ 0.043276] cpuidle-powernv: First stop level that may lose timebase = 0x10
> WARNING: 156265663: (162472569): Write_Mapped_Memory_Reg: Unknown address: 0x00000BEE18C96D60, length=8
> FATAL ERROR: 156265663: (162472569): Attempt to store non-existent address 0x00000BEE18C96D60
> INFO: 156265663: (162472570): ** Execution stopped: Mambo Error, **
> 156265663: ** finished running 162472570 instructions **

I notice this doesn't tell us what CPU caused the bad access.

> systemsim % p pc
> 0xC0000000003A2EC4

"p" implicitly shows CPU 0 unless you've used the "target" command.

> systemsim % di 0xC0000000003A2EC4
> WARNING: 156265663: (162472570): Need to define a CPU
> WARNING: 156265663: (162472570): Need to define a CPU
> EADDR:0xC0000000003A2EC0 RADDR:0x003A2EC0 Enc:0x2C4A007C : dcbt r0,r9,0
> WARNING: 156265663: (162472570): Need to define a CPU
> WARNING: 156265663: (162472570): Need to define a CPU
> EADDR:0xC0000000003A2EC4 RADDR:0x003A2EC4 Enc:0x00000060 : nop
> WARNING: 156265663: (162472570): Need to define a CPU
> WARNING: 156265663: (162472570): Need to define a CPU
> EADDR:0xC0000000003A2EC8 RADDR:0x003A2EC8 Enc:0x00000060 : nop
> WARNING: 156265663: (162472570): Need to define a CPU
> WARNING: 156265663: (162472570): Need to define a CPU
>
> Just before the 'nop' there is a dcbt, but the address passed to it,
> in GPR 9, is nowhere near the address displayed by mambo:
>
> systemsim % mysim cpu 0:0:0 display gpr 9
> 0x0000000000000240
> systemsim % mysim mcm 0 cpu 0 thread 0 dtranslate 0x240
> data address translation for 0x0000000000000240 failed
> systemsim %
>
> The dcbt instruction is in mm/slub.c; more context:
>
> prefetch(object + s->offset);
> c0000000003a2eb4: 20 00 3f 81 lwz r9,32(r31)
> if (unlikely(!x))
> c0000000003a2eb8: 15 4a 3a 7d add. r9,r26,r9
> c0000000003a2ebc: 08 00 82 41 beq c0000000003a2ec4 <kmem_cache_alloc+0x114>
> __asm__ __volatile__ ("dcbt 0,%0" : : "r" (x));
> c0000000003a2ec0: 2c 4a 00 7c dcbt 0,r9
> c0000000003a2ec4: 00 00 00 60 nop
> c0000000003a2ec8: 00 00 00 60 nop
>
> so probably from a prefetch_freepointer() in
> https://github.com/torvalds/linux/blob/master/mm/slub.c#L2815

But we shouldn't be prefetching 0x240, that's a userspace address. So I
suspect something has gone wrong with the debug here.
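
As a rough sanity check of the addresses and the instruction quoted above,
here is a minimal Python sketch. The helper names are hypothetical and are
not part of mambo or the kernel. It assumes that kernel linear-map effective
addresses have a top nibble of 0xC and map to real addresses by clearing
that nibble, which is consistent with the EADDR/RADDR pairs printed by "di",
and that the "Enc" bytes are the instruction word in little-endian order.

```python
# Hypothetical helpers for illustration only; not mambo or kernel code.

# Assumption: kernel linear-map effective addresses start at 0xC000...,
# matching the EADDR values shown by "di" in this thread.
KERNEL_LINEAR_BASE = 0xC000000000000000


def is_kernel_linear(ea: int) -> bool:
    """True if the effective address has the 0xC top nibble."""
    return (ea >> 60) == 0xC


def linear_to_real(ea: int) -> int:
    """Clear the top nibble, as the EADDR -> RADDR pairs above suggest."""
    return ea & 0x0FFFFFFFFFFFFFFF


def decode_x_form(raw: bytes):
    """Split a 4-byte little-endian X-form instruction into its fields."""
    word = int.from_bytes(raw, "little")
    opcode = word >> 26           # primary opcode, top 6 bits
    rt = (word >> 21) & 0x1F      # RT/TH field
    ra = (word >> 16) & 0x1F      # RA field (0 means a base of zero)
    rb = (word >> 11) & 0x1F      # RB field
    xo = (word >> 1) & 0x3FF      # extended opcode
    return opcode, rt, ra, rb, xo


if __name__ == "__main__":
    pc = 0xC0000000003A2EC4
    print(hex(linear_to_real(pc)))       # real address behind the PC
    print(is_kernel_linear(0x240))       # the value found in GPR 9
    # Bytes as printed by "di" / objdump for the instruction before the nop.
    # Opcode 31 with extended opcode 278 is dcbt; RA=0, RB=9 means EA = r9.
    print(decode_x_form(bytes.fromhex("2c4a007c")))
```

The base address works out to 12 * 2**60 bytes, which is why passing an
effective address to "mysim memory display" cannot hit real RAM, and the
decode confirms that the dcbt only uses r9, so 0x240 really is the address
it would have touched on that CPU.
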
>>> systemsim % c
>>> 145785082: (550674532): [ 0.008506] smp: Brought up 2 nodes, 4 CPUs
>>> 145798101: (550687551): [ 0.008527] numa: Node 0 CPUs: 0-1
>>> 145810514: (550799964): [ 0.008551] numa: Node 1 CPUs: 2-3
>>
>> Does it still happen with a single CPU?
>
> It still happens with maxcpus=0 or =1. However, if I disable radix by passing
> disable_radix=1 on the command line, I'm able to boot.

Hmm OK.

> Today I built upstream mambo and the same issue happens. I still have no
> idea what's happening, so if you have additional things to try, let
> me know. It might be an issue with Mambo P10 running on P8.

Just to be clear, it doesn't happen with mambo simulating P9, right?

cheers