[Bug 204789] New: Boot failure with more than 256G of memory

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Fri Sep 13 14:53:35 AEST 2019


Andrew Morton <akpm at linux-foundation.org> writes:

> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Sun, 08 Sep 2019 00:04:26 +0000 bugzilla-daemon at bugzilla.kernel.org wrote:
>
>> https://bugzilla.kernel.org/show_bug.cgi?id=204789
>> 
>>             Bug ID: 204789
>>            Summary: Boot failure with more than 256G of memory
>>            Product: Memory Management
>>            Version: 2.5
>>     Kernel Version: 5.2.x
>>           Hardware: PPC-64
>>                 OS: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: high
>>           Priority: P1
>>          Component: Other
>>           Assignee: akpm at linux-foundation.org
>>           Reporter: cam at neo-zeon.de
>>         Regression: No
>
> "Yes" :)
>
>> Kernel series 5.2.x will not boot on my Talos II workstation with dual POWER9
>> 18 core processors and 512G of physical memory with disable_radix=yes and 4k
>> pages.
>> 
>> 5.3-rc6 did not work either.
>> 
>> 5.1 and earlier boot fine. 
>
> Thanks.  It's probably best to report this on the powerpc list, cc'ed here.
>
>> I can get the system to boot IF I leave the Radix MMU enabled or if I boot a
>> kernel with 64k pages. I haven't yet tested enabling the Radix MMU with 64k
>> pages at the same time, but I suspect this would work. This is a system I
>> cannot take down TOO frequently.
>> 
>> The system will also boot with the Radix MMU disabled and 4k pages with 256G or
>> less memory. Setting mem on the kernel CLI to 256G or less results in a
>> successful boot. With mem=257G or higher, the Radix MMU disabled, and 4k
>> pages, the kernel will not boot.
>> 
>> Petitboot comes up, but the system fails VERY early in boot in the serial
>> console with:
>> SIGTERM received, booting...
>> [   23.838858] kexec_core: Starting new kernel
>> 
>> Early printk is enabled, and it never progresses any further.
>> 
>> 5.1 boots just fine with the Radix MMU disabled and 4k pages.
>> 
>> Unfortunately, I currently need 4k pages for bcache to work, and Radix MMU
>> disabled in order for FreeBSD 12.x to work under KVM so I'm sticking with
>> 5.1.21 for now.
>> 
>> I have been unable to reproduce this issue in KVM.
>> 
>> Here are my PCIe peripherals:
>> 1. Microsemi/Adaptec HBA 1100-4i SAS controller
>> 2. Megaraid 9316-16i SAS RAID controller.
>> 
>> I've only tried little endian as this is a little endian install.

Will you be able to bisect this? I tried a 4K page size on P8 with the
upstream kernel and could not reproduce the issue.

[root at ltc ~]# free -g
              total        used        free      shared  buff/cache   available
Mem:            495           0         494           0           0         493
Swap:             0           0           0
[root at ltc ~]# getconf PAGESIZE
4096
[root at ltc ~]# grep Hash /proc/cpuinfo 
MMU             : Hash
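
For comparing setups, the three checks above (total memory, page size, MMU
type) can be collected in one go. This is only a rough sketch: the function
names kb_to_gib, mmu_type and report are made up for illustration, and it
assumes a Linux /proc layout where powerpc exposes an "MMU" line in
/proc/cpuinfo, as in the output above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any existing tool.

# Convert a size in kB (as reported by MemTotal) to whole GiB, rounded down.
kb_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Print the MMU type from a cpuinfo-style file (default: /proc/cpuinfo).
# Prints "unknown" if the file has no line starting with "MMU".
mmu_type() {
    awk -F': *' '
        /^MMU/ { print $2; found = 1; exit }
        END    { if (!found) print "unknown" }
    ' "${1:-/proc/cpuinfo}"
}

# Print page size, MMU type and total memory for the running system.
report() {
    mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "page size: $(getconf PAGESIZE)"
    echo "MMU:       $(mmu_type)"
    echo "memory:    $(kb_to_gib "$mem_kb") GiB"
}

# Only run the report where /proc is available.
if [ -r /proc/meminfo ] && [ -r /proc/cpuinfo ]; then
    report
fi
```

Note that kb_to_gib rounds down, so its figure can differ slightly from what
free -g shows.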

I will see if I can get a P9 system with a large amount of memory.

-aneesh

