[5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9

Sachin Sant sachinp at linux.vnet.ibm.com
Sat Mar 14 19:10:05 AEDT 2020



> On 13-Mar-2020, at 5:05 PM, Vlastimil Babka <vbabka at suse.cz> wrote:
> 
> On 3/13/20 12:12 PM, Srikar Dronamraju wrote:
>> * Michael Ellerman <mpe at ellerman.id.au> [2020-03-13 21:48:06]:
>> 
>>> Sachin Sant <sachinp at linux.vnet.ibm.com> writes:
>>>>> The patch below might work. Sachin can you test this? I tried faking up
>>>>> a system with a memoryless node zero but couldn't get it to even start
>>>>> booting.
>>>>> 
>>>> The patch did not help. The kernel crashed during
>>>> the boot with the same call trace.
>>>> 
>>>> BUG_ON() introduced with the patch was not triggered.
>>> 
>>> OK, that's weird.
>>> 
>>> I eventually managed to get a memoryless node going in sim, and it
>>> appears to work there.
>>> 
>>> eg in dmesg:
>>> 
>>>  [    0.000000][    T0] numa:   NODE_DATA [mem 0x2000fffa2f80-0x2000fffa7fff]
>>>  [    0.000000][    T0] numa:     NODE_DATA(0) on node 1
>>>  [    0.000000][    T0] numa:   NODE_DATA [mem 0x2000fff9df00-0x2000fffa2f7f]
>>>  ...
>>>  [    0.000000][    T0] Early memory node ranges
>>>  [    0.000000][    T0]   node   1: [mem 0x0000000000000000-0x00000000ffffffff]
>>>  [    0.000000][    T0]   node   1: [mem 0x0000200000000000-0x00002000ffffffff]
>>>  [    0.000000][    T0] Could not find start_pfn for node 0
>>>  [    0.000000][    T0] Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000]
>>>  [    0.000000][    T0] On node 0 totalpages: 0
>>>  [    0.000000][    T0] Initmem setup node 1 [mem 0x0000000000000000-0x00002000ffffffff]
>>>  [    0.000000][    T0] On node 1 totalpages: 131072
>>> 
>>>  # dmesg | grep set_numa
>>>  [    0.000000][    T0] set_numa_mem: mem node for 0 = 1
>>>  [    0.005654][    T0] set_numa_mem: mem node for 1 = 1
>>> 
>>> So is the problem more than just node zero having no memory?
>>> 

I tried just the patch Michael suggested on top of the March 13 linux-next tree.
I still see the same failure. Here is a snippet from the log:

[    0.000000] numa:   NODE_DATA [mem 0x8bfedc900-0x8bfee3fff]
[    0.000000] numa:     NODE_DATA(0) on node 1
[    0.000000] numa:   NODE_DATA [mem 0x8bfed5200-0x8bfedc8ff]
[    0.000000] rfi-flush: fallback displacement flush available
[    0.000000] rfi-flush: mttrig type flush available
[    0.000000] link-stack-flush: software flush enabled.
[    0.000000] count-cache-flush: software flush disabled.
[    0.000000] stf-barrier: eieio barrier available
[    0.000000] lpar: H_BLOCK_REMOVE supports base psize:0 psize:0 block size:8
[    0.000000] lpar: H_BLOCK_REMOVE supports base psize:0 psize:2 block size:8
[    0.000000] lpar: H_BLOCK_REMOVE supports base psize:0 psize:10 block size:8
[    0.000000] lpar: H_BLOCK_REMOVE supports base psize:2 psize:2 block size:8
[    0.000000] lpar: H_BLOCK_REMOVE supports base psize:2 psize:10 block size:8
[    0.000000] PPC64 nvram contains 15360 bytes
[    0.000000] barrier-nospec: using ORI speculation barrier
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000000000000-0x00000008bfffffff]
[    0.000000]   Device   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   1: [mem 0x0000000000000000-0x00000008bfffffff]
[    0.000000] Could not find start_pfn for node 0
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000]
[    0.000000] Initmem setup node 1 [mem 0x0000000000000000-0x00000008bfffffff]
[    0.000000] percpu: Embedded 11 pages/cpu s624024 r0 d96872 u1048576
[    0.000000] Built 2 zonelists, mobility grouping on.  Total pages: 572880
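
For anyone comparing logs, one quick way to spot a memoryless node is to look
for nodes that get an "Initmem setup" line but no entry under "Early memory
node ranges". Below is a minimal Python sketch of that check, assuming the
dmesg format shown above. The function name memoryless_nodes and the regular
expressions are illustrative only and do not come from any kernel tool.

```python
import re

# Matches range lines such as "  node   1: [mem 0x...-0x...]" that appear
# under "Early memory node ranges" (format assumed from the log above).
RANGE_RE = re.compile(r"\bnode\s+(\d+): \[mem 0x[0-9a-f]+-0x[0-9a-f]+\]")
# Matches "Initmem setup node N [mem ...]" lines, one per initialised node.
INITMEM_RE = re.compile(r"Initmem setup node (\d+) \[mem")

# Sample copied from the snippet above.
SAMPLE = """\
[    0.000000] Early memory node ranges
[    0.000000]   node   1: [mem 0x0000000000000000-0x00000008bfffffff]
[    0.000000] Could not find start_pfn for node 0
[    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000]
[    0.000000] Initmem setup node 1 [mem 0x0000000000000000-0x00000008bfffffff]
"""


def memoryless_nodes(dmesg_text):
    """Return sorted ids of nodes that are set up but own no early memory range."""
    with_memory = set()
    initialised = set()
    for line in dmesg_text.splitlines():
        match = INITMEM_RE.search(line)
        if match:
            initialised.add(int(match.group(1)))
            continue
        match = RANGE_RE.search(line)
        if match:
            with_memory.add(int(match.group(1)))
    return sorted(initialised - with_memory)


if __name__ == "__main__":
    print(memoryless_nodes(SAMPLE))
```

The same function should also accept logs that carry the extra "[    T0]"
column, since both patterns search anywhere in the line.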

I have attached the complete boot log.

Thanks
-Sachin

-------------- next part --------------
A non-text attachment was scrubbed...
Name: kernel-boot.log
Type: application/octet-stream
Size: 20143 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20200314/139d2b83/attachment-0001.obj>