[PATCH v2] powerpc/mm: Update default hugetlb size early

Aneesh Kumar K.V aneesh.kumar at linux.ibm.com
Sat Feb 12 01:40:51 AEDT 2022


Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com> writes:

> David Hildenbrand <david at redhat.com> writes:
>
>> On 11.02.22 10:16, Aneesh Kumar K V wrote:
>>> On 2/11/22 14:00, David Hildenbrand wrote:
>>>> On 11.02.22 07:52, Aneesh Kumar K.V wrote:
>>>>> commit: d9c234005227 ("Do not depend on MAX_ORDER when grouping pages by mobility")
....
....

> I could build a kernel with FORCE_MAX_ZONEORDER=8 and pageblock_order =
> 8. We need to disable THP for such a kernel to boot, because THP does
> check for PMD_ORDER < MAX_ORDER. I was able to boot that kernel on a
> virtualized platform, but then runtime allocation of gigantic pages
> (gigantic_page_runtime_supported) is not supported on such a config
> with hash translation.
>
> On a non-virtualized platform I am hitting crashes like the one below during boot.
>
> [   47.637865][   C42] =============================================================================                                                                                                                                                                                                              
> [   47.637907][   C42] BUG pgtable-2^11 (Not tainted): Object already free                                                                                     
> [   47.637925][   C42] -----------------------------------------------------------------------------                                                           
> [   47.637925][   C42]                                                                                                                                         
> [   47.637945][   C42] Allocated in __pud_alloc+0x84/0x2a0 age=278 cpu=40 pid=1409                                                                             
> [   47.637974][   C42]  __slab_alloc.isra.0+0x40/0x60                                                                                                          
> [   47.637995][   C42]  kmem_cache_alloc+0x1a8/0x510                                                                                                           
> [   47.638010][   C42]  __pud_alloc+0x84/0x2a0                                                                                                                 
> [   47.638024][   C42]  copy_page_range+0x38c/0x1b90                                                                                                           
> [   47.638040][   C42]  dup_mm+0x548/0x880                                                                                                                     
> [   47.638058][   C42]  copy_process+0xdc0/0x1e90                                                                                                              
> [   47.638076][   C42]  kernel_clone+0xd4/0x9d0                                                                                                                
> [   47.638094][   C42]  __do_sys_clone+0x88/0xe0                                                                                                               
> [   47.638112][   C42]  system_call_exception+0x368/0x3a0                                                                                                      
> [   47.638128][   C42]  system_call_common+0xec/0x250                                                                                                          
> [   47.638147][   C42] Freed in __tlb_remove_table+0x1d4/0x200 age=263 cpu=57 pid=326                                                                          
> [   47.638172][   C42]  kmem_cache_free+0x44c/0x680                                                                                                            
> [   47.638187][   C42]  __tlb_remove_table+0x1d4/0x200                                                                                                         
> [   47.638204][   C42]  tlb_remove_table_rcu+0x54/0xa0                                                                                                         
> [   47.638222][   C42]  rcu_core+0xdd4/0x15d0                                                                                                                  
> [   47.638239][   C42]  __do_softirq+0x360/0x69c                                                                                                               
> [   47.638257][   C42]  run_ksoftirqd+0x54/0xc0                                                                                                                
> [   47.638273][   C42]  smpboot_thread_fn+0x28c/0x2f0                                                                                                          
> [   47.638290][   C42]  kthread+0x1a4/0x1b0                                                                                                                    
> [   47.638305][   C42]  ret_from_kernel_thread+0x5c/0x64                                                                                                       
> [   47.638320][   C42] Slab 0xc00c00000000d600 objects=10 used=9 fp=0xc0000000035a8000 flags=0x7ffff000010201(locked|slab|head|node=0|zone=0|lastcpupid=0x7ffff)                                                                                                                                                              
> [   47.638352][   C42] Object 0xc0000000035a8000 @offset=163840 fp=0x0000000000000000                                                                          
> [   47.638352][   C42]                                                                                                                                         
> [   47.638373][   C42] Redzone  c0000000035a4000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638394][   C42] Redzone  c0000000035a4010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638414][   C42] Redzone  c0000000035a4020: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638435][   C42] Redzone  c0000000035a4030: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638455][   C42] Redzone  c0000000035a4040: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638474][   C42] Redzone  c0000000035a4050: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638494][   C42] Redzone  c0000000035a4060: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638514][   C42] Redzone  c0000000035a4070: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            
> [   47.638534][   C42] Redzone  c0000000035a4080: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb  ................                                            

OK, that turned out to be unrelated; I was using the wrong kernel. I can
boot a kernel with pageblock_order > MAX_ORDER and run the hugetlb-related
tests fine. I do get the warning below, which you had already called out
in your patch.

[    3.952124] WARNING: CPU: 16 PID: 719 at mm/vmstat.c:1103 __fragmentation_index+0x14/0x70                                                                   
[    3.952136] Modules linked in:                                                                                                                              
[    3.952141] CPU: 16 PID: 719 Comm: kswapd0 Tainted: G    B             5.17.0-rc3-00044-g69052ffa0e08 #68                                                   
[    3.952149] NIP:  c000000000465264 LR: c000000000468544 CTR: 0000000000000000                                                                               
[    3.952154] REGS: c000000014a4f7e0 TRAP: 0700   Tainted: G    B              (5.17.0-rc3-00044-g69052ffa0e08)
[    3.952161] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 44042422  XER: 20000000
[    3.952174] CFAR: c000000000468540 IRQMASK: 0                  
               GPR00: c000000000468544 c000000014a4fa80 c000000001ea9500 0000000000000008 
               GPR04: c000000014a4faa0 00000000001fd700 0000000000004003 00000000001fd92d 
               GPR08: c000001fffd1c7a0 0000000000000008 0000000000000008 0000000000000000 
               GPR12: 0000000000002200 c000001fffff2880 0000000000000000 c000000013cfd240                                                                      
               GPR16: c000000011940600 c000001fffd21058 0000000000000d00 c000000001407d30                                                                      
               GPR20: ffffffffffffffaf c000001fffd21098 0000000000000000 c000000002ab7328                                                                      
               GPR24: c000000011940600 c000001fffd21300 0000000000000000 0000000000000008 
               GPR28: c000001fffd1c280 0000000000000008 0000000000000000 0000000000000004                                                                      
[    3.952231] NIP [c000000000465264] __fragmentation_index+0x14/0x70                                                                                          
[    3.952237] LR [c000000000468544] fragmentation_index+0xb4/0xe0                                                                                             
[    3.952244] Call Trace:                                        
[    3.952247] [c000000014a4fa80] [c00000000023e248] lock_release+0x138/0x470 (unreliable)
[    3.952256] [c000000014a4fac0] [c00000000047cd84] compaction_suitable+0x94/0x270
[    3.952263] [c000000014a4fb10] [c0000000004802b8] wakeup_kcompactd+0xc8/0x2a0
[    3.952270] [c000000014a4fb60] [c000000000457568] balance_pgdat+0x798/0x8d0
[    3.952277] [c000000014a4fca0] [c000000000457d14] kswapd+0x674/0x7b0                                                                                        
[    3.952283] [c000000014a4fdc0] [c0000000001d7e84] kthread+0x144/0x150                                                                                       
[    3.952290] [c000000014a4fe10] [c00000000000cd74] ret_from_kernel_thread+0x5c/0x64
[    3.952297] Instruction dump:                                      
[    3.952301] 7d2021ad 40c2fff4 e8ed0030 38a00000 7caa39ae 4e800020 60000000 7c0802a6 
[    3.952311] 60000000 28030007 7c6a1b78 40810010 <0fe00000> 60000000 60000000 e9040008 
[    3.952322] irq event stamp: 0                                        
[    3.952325] hardirqs last  enabled at (0): [<0000000000000000>] 0x0                                                                                         
[    3.952331] hardirqs last disabled at (0): [<c000000000196030>] copy_process+0x970/0x1de0                                                                   
[    3.952339] softirqs last  enabled at (0): [<c000000000196030>] copy_process+0x970/0x1de0                                                                   
[    3.952345] softirqs last disabled at (0): [<0000000000000000>] 0x0                                                                                         

I am not sure whether there is any value in selecting MAX_ORDER = 8 on
ppc64. If not, we could do a patch like the one below for ppc64.

commit 09ed79c4fda92418914546f36c2750670503d7a0
Author: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
Date:   Fri Feb 11 17:15:10 2022 +0530

    powerpc/mm: Disable MAX_ORDER value 8 on book3s64 with 64K pagesize
    
    With transparent hugepage support we expect HPAGE_PMD_ORDER < MAX_ORDER.
    Without this we BUG() during boot as below
    
    cpu 0x6: Vector: 700 (Program Check) at [c000000012143880]
        pc: c000000001b4ddbc: hugepage_init+0x108/0x2c4
        lr: c000000001b4dd98: hugepage_init+0xe4/0x2c4
        sp: c000000012143b20
       msr: 8000000002029033
      current = 0xc0000000120d0f80
      paca    = 0xc00000001ec7e900   irqmask: 0x03   irq_happened: 0x01
        pid   = 1, comm = swapper/0
    kernel BUG at mm/huge_memory.c:413!
    [c000000012143b20] c0000000022c0468 blacklisted_initcalls+0x120/0x1c8 (unreliable)
    [c000000012143bb0] c000000000012104 do_one_initcall+0x94/0x520
    [c000000012143c90] c000000001b04da0 kernel_init_freeable+0x444/0x508
    [c000000012143da0] c000000000012d8c kernel_init+0x44/0x188
    [c000000012143e10] c00000000000cbf4 ret_from_kernel_thread+0x5c/0x64
    
    Hence a FORCE_MAX_ZONEORDER value < 9 doesn't make sense with THP
    enabled. We also cannot have a value > 9 because we are limited by
    SECTION_SIZE_BITS
    
     #if (MAX_ORDER - 1 + PAGE_SHIFT) > SECTION_SIZE_BITS
     #error Allocator MAX_ORDER exceeds SECTION_SIZE
     #endif
    
    We can select MAX_ORDER value 8 by disabling THP support, but that then
    results in pageblock_order > MAX_ORDER - 1, which is not fully tested/supported.
    
    Cc: David Hildenbrand <david at redhat.com>
    Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b779603978e1..a050f5f46df3 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -807,7 +807,7 @@ config DATA_SHIFT
 
 config FORCE_MAX_ZONEORDER
 	int "Maximum zone order"
-	range 8 9 if PPC64 && PPC_64K_PAGES
+	range 9 9 if PPC64 && PPC_64K_PAGES
 	default "9" if PPC64 && PPC_64K_PAGES
 	range 13 13 if PPC64 && !PPC_64K_PAGES
 	default "13" if PPC64 && !PPC_64K_PAGES


