[5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9
Sachin Sant
sachinp at linux.vnet.ibm.com
Sat Mar 14 19:10:05 AEDT 2020
> On 13-Mar-2020, at 5:05 PM, Vlastimil Babka <vbabka at suse.cz> wrote:
>
> On 3/13/20 12:12 PM, Srikar Dronamraju wrote:
>> * Michael Ellerman <mpe at ellerman.id.au> [2020-03-13 21:48:06]:
>>
>>> Sachin Sant <sachinp at linux.vnet.ibm.com> writes:
>>>>> The patch below might work. Sachin can you test this? I tried faking up
>>>>> a system with a memoryless node zero but couldn't get it to even start
>>>>> booting.
>>>>>
>>>> The patch did not help. The kernel crashed during
>>>> the boot with the same call trace.
>>>>
>>>> BUG_ON() introduced with the patch was not triggered.
>>>
>>> OK, that's weird.
>>>
>>> I eventually managed to get a memoryless node going in sim, and it
>>> appears to work there.
>>>
>>> eg in dmesg:
>>>
>>> [ 0.000000][ T0] numa: NODE_DATA [mem 0x2000fffa2f80-0x2000fffa7fff]
>>> [ 0.000000][ T0] numa: NODE_DATA(0) on node 1
>>> [ 0.000000][ T0] numa: NODE_DATA [mem 0x2000fff9df00-0x2000fffa2f7f]
>>> ...
>>> [ 0.000000][ T0] Early memory node ranges
>>> [ 0.000000][ T0] node 1: [mem 0x0000000000000000-0x00000000ffffffff]
>>> [ 0.000000][ T0] node 1: [mem 0x0000200000000000-0x00002000ffffffff]
>>> [ 0.000000][ T0] Could not find start_pfn for node 0
>>> [ 0.000000][ T0] Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000]
>>> [ 0.000000][ T0] On node 0 totalpages: 0
>>> [ 0.000000][ T0] Initmem setup node 1 [mem 0x0000000000000000-0x00002000ffffffff]
>>> [ 0.000000][ T0] On node 1 totalpages: 131072
>>>
>>> # dmesg | grep set_numa
>>> [ 0.000000][ T0] set_numa_mem: mem node for 0 = 1
>>> [ 0.005654][ T0] set_numa_mem: mem node for 1 = 1
>>>
>>> So is the problem more than just node zero having no memory?
>>>
I tried with just the patch Michael suggested, applied on top of the March 13 linux-next tree.
I still see the same failure. Here is a snippet from the log:
[ 0.000000] numa: NODE_DATA [mem 0x8bfedc900-0x8bfee3fff]
[ 0.000000] numa: NODE_DATA(0) on node 1
[ 0.000000] numa: NODE_DATA [mem 0x8bfed5200-0x8bfedc8ff]
[ 0.000000] rfi-flush: fallback displacement flush available
[ 0.000000] rfi-flush: mttrig type flush available
[ 0.000000] link-stack-flush: software flush enabled.
[ 0.000000] count-cache-flush: software flush disabled.
[ 0.000000] stf-barrier: eieio barrier available
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:0 psize:0 block size:8
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:0 psize:2 block size:8
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:0 psize:10 block size:8
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:2 psize:2 block size:8
[ 0.000000] lpar: H_BLOCK_REMOVE supports base psize:2 psize:10 block size:8
[ 0.000000] PPC64 nvram contains 15360 bytes
[ 0.000000] barrier-nospec: using ORI speculation barrier
[ 0.000000] Zone ranges:
[ 0.000000] Normal [mem 0x0000000000000000-0x00000008bfffffff]
[ 0.000000] Device empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 1: [mem 0x0000000000000000-0x00000008bfffffff]
[ 0.000000] Could not find start_pfn for node 0
[ 0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000]
[ 0.000000] Initmem setup node 1 [mem 0x0000000000000000-0x00000008bfffffff]
[ 0.000000] percpu: Embedded 11 pages/cpu s624024 r0 d96872 u1048576
[ 0.000000] Built 2 zonelists, mobility grouping on. Total pages: 572880
I have attached the complete boot log.
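
To check quickly whether a boot log shows a memoryless node, as node 0 does in the snippet above, something like the following Python sketch could be used. It is an illustration only and not part of the patch under test; the function name memoryless_nodes and the regular expression are assumptions based on the "Initmem setup node" lines quoted above, and a node is treated as memoryless when its printed range is empty (end not greater than start).

```python
import re

# Matches lines such as:
#   "Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000]"
# The pattern is an assumption derived from the log lines in this thread.
INITMEM_RE = re.compile(
    r"Initmem setup node (\d+) \[mem (0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\]"
)


def memoryless_nodes(log_text):
    """Return the sorted node ids whose Initmem setup range is empty.

    Hypothetical helper: a range counts as empty when end <= start,
    which is how node 0 appears in the logs above (0x0-0x0).
    """
    nodes = set()
    for match in INITMEM_RE.finditer(log_text):
        node = int(match.group(1))
        start = int(match.group(2), 16)
        end = int(match.group(3), 16)
        if end <= start:
            nodes.add(node)
    return sorted(nodes)


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 this_script.py
    print(memoryless_nodes(sys.stdin.read()))
```

Fed the two "Initmem setup" lines from the snippet above, this should report only node 0, since node 1 has a non-empty range.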
Thanks
-Sachin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kernel-boot.log
Type: application/octet-stream
Size: 20143 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20200314/139d2b83/attachment-0001.obj>