[PATCH v2] powerpc/mm: Update default hugetlb size early
Aneesh Kumar K.V
aneesh.kumar at linux.ibm.com
Sat Feb 12 01:40:51 AEDT 2022
Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com> writes:
> David Hildenbrand <david at redhat.com> writes:
>
>> On 11.02.22 10:16, Aneesh Kumar K V wrote:
>>> On 2/11/22 14:00, David Hildenbrand wrote:
>>>> On 11.02.22 07:52, Aneesh Kumar K.V wrote:
>>>>> commit: d9c234005227 ("Do not depend on MAX_ORDER when grouping pages by mobility")
....
....
> I could build a kernel with FORCE_MAX_ZONEORDER=8 and pageblock_order =
> 8. We need to disable THP for such a kernel to boot, because THP does
> check for PMD_ORDER < MAX_ORDER. I was able to boot that kernel on a
> virtualized platform, but then gigantic_page_runtime_supported is not
> supported on such a config with hash translation.
>
> On non virtualized platform I am hitting crashes like below during boot.
>
> [ 47.637865][ C42] =============================================================================
> [ 47.637907][ C42] BUG pgtable-2^11 (Not tainted): Object already free
> [ 47.637925][ C42] -----------------------------------------------------------------------------
> [ 47.637925][ C42]
> [ 47.637945][ C42] Allocated in __pud_alloc+0x84/0x2a0 age=278 cpu=40 pid=1409
> [ 47.637974][ C42] __slab_alloc.isra.0+0x40/0x60
> [ 47.637995][ C42] kmem_cache_alloc+0x1a8/0x510
> [ 47.638010][ C42] __pud_alloc+0x84/0x2a0
> [ 47.638024][ C42] copy_page_range+0x38c/0x1b90
> [ 47.638040][ C42] dup_mm+0x548/0x880
> [ 47.638058][ C42] copy_process+0xdc0/0x1e90
> [ 47.638076][ C42] kernel_clone+0xd4/0x9d0
> [ 47.638094][ C42] __do_sys_clone+0x88/0xe0
> [ 47.638112][ C42] system_call_exception+0x368/0x3a0
> [ 47.638128][ C42] system_call_common+0xec/0x250
> [ 47.638147][ C42] Freed in __tlb_remove_table+0x1d4/0x200 age=263 cpu=57 pid=326
> [ 47.638172][ C42] kmem_cache_free+0x44c/0x680
> [ 47.638187][ C42] __tlb_remove_table+0x1d4/0x200
> [ 47.638204][ C42] tlb_remove_table_rcu+0x54/0xa0
> [ 47.638222][ C42] rcu_core+0xdd4/0x15d0
> [ 47.638239][ C42] __do_softirq+0x360/0x69c
> [ 47.638257][ C42] run_ksoftirqd+0x54/0xc0
> [ 47.638273][ C42] smpboot_thread_fn+0x28c/0x2f0
> [ 47.638290][ C42] kthread+0x1a4/0x1b0
> [ 47.638305][ C42] ret_from_kernel_thread+0x5c/0x64
> [ 47.638320][ C42] Slab 0xc00c00000000d600 objects=10 used=9 fp=0xc0000000035a8000 flags=0x7ffff000010201(locked|slab|head|node=0|zone=0|lastcpupid=0x7ffff)
> [ 47.638352][ C42] Object 0xc0000000035a8000 @offset=163840 fp=0x0000000000000000
> [ 47.638352][ C42]
> [ 47.638373][ C42] Redzone c0000000035a4000: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> [ 47.638394][ C42] Redzone c0000000035a4010: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> [ 47.638414][ C42] Redzone c0000000035a4020: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> [ 47.638435][ C42] Redzone c0000000035a4030: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> [ 47.638455][ C42] Redzone c0000000035a4040: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> [ 47.638474][ C42] Redzone c0000000035a4050: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> [ 47.638494][ C42] Redzone c0000000035a4060: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> [ 47.638514][ C42] Redzone c0000000035a4070: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
> [ 47.638534][ C42] Redzone c0000000035a4080: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................
OK, that turned out to be unrelated; I was using the wrong kernel. I can
boot a kernel with pageblock_order > MAX_ORDER and run the hugetlb-related
tests fine. I do get the warning below, which you had already called out
in your patch.
[ 3.952124] WARNING: CPU: 16 PID: 719 at mm/vmstat.c:1103 __fragmentation_index+0x14/0x70
[ 3.952136] Modules linked in:
[ 3.952141] CPU: 16 PID: 719 Comm: kswapd0 Tainted: G B 5.17.0-rc3-00044-g69052ffa0e08 #68
[ 3.952149] NIP: c000000000465264 LR: c000000000468544 CTR: 0000000000000000
[ 3.952154] REGS: c000000014a4f7e0 TRAP: 0700 Tainted: G B (5.17.0-rc3-00044-g69052ffa0e08)
[ 3.952161] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 44042422 XER: 20000000
[ 3.952174] CFAR: c000000000468540 IRQMASK: 0
GPR00: c000000000468544 c000000014a4fa80 c000000001ea9500 0000000000000008
GPR04: c000000014a4faa0 00000000001fd700 0000000000004003 00000000001fd92d
GPR08: c000001fffd1c7a0 0000000000000008 0000000000000008 0000000000000000
GPR12: 0000000000002200 c000001fffff2880 0000000000000000 c000000013cfd240
GPR16: c000000011940600 c000001fffd21058 0000000000000d00 c000000001407d30
GPR20: ffffffffffffffaf c000001fffd21098 0000000000000000 c000000002ab7328
GPR24: c000000011940600 c000001fffd21300 0000000000000000 0000000000000008
GPR28: c000001fffd1c280 0000000000000008 0000000000000000 0000000000000004
[ 3.952231] NIP [c000000000465264] __fragmentation_index+0x14/0x70
[ 3.952237] LR [c000000000468544] fragmentation_index+0xb4/0xe0
[ 3.952244] Call Trace:
[ 3.952247] [c000000014a4fa80] [c00000000023e248] lock_release+0x138/0x470 (unreliable)
[ 3.952256] [c000000014a4fac0] [c00000000047cd84] compaction_suitable+0x94/0x270
[ 3.952263] [c000000014a4fb10] [c0000000004802b8] wakeup_kcompactd+0xc8/0x2a0
[ 3.952270] [c000000014a4fb60] [c000000000457568] balance_pgdat+0x798/0x8d0
[ 3.952277] [c000000014a4fca0] [c000000000457d14] kswapd+0x674/0x7b0
[ 3.952283] [c000000014a4fdc0] [c0000000001d7e84] kthread+0x144/0x150
[ 3.952290] [c000000014a4fe10] [c00000000000cd74] ret_from_kernel_thread+0x5c/0x64
[ 3.952297] Instruction dump:
[ 3.952301] 7d2021ad 40c2fff4 e8ed0030 38a00000 7caa39ae 4e800020 60000000 7c0802a6
[ 3.952311] 60000000 28030007 7c6a1b78 40810010 <0fe00000> 60000000 60000000 e9040008
[ 3.952322] irq event stamp: 0
[ 3.952325] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 3.952331] hardirqs last disabled at (0): [<c000000000196030>] copy_process+0x970/0x1de0
[ 3.952339] softirqs last enabled at (0): [<c000000000196030>] copy_process+0x970/0x1de0
[ 3.952345] softirqs last disabled at (0): [<0000000000000000>] 0x0
I am not sure whether there is any value in selecting MAX_ORDER = 8 on
ppc64. If not, we could apply a patch like the one below for ppc64.
commit 09ed79c4fda92418914546f36c2750670503d7a0
Author: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
Date: Fri Feb 11 17:15:10 2022 +0530
powerpc/mm: Disable MAX_ORDER value 8 on book3s64 with 64K pagesize
With transparent hugepage support we expect HPAGE_PMD_ORDER < MAX_ORDER.
Without this, we BUG() during boot as below:
cpu 0x6: Vector: 700 (Program Check) at [c000000012143880]
pc: c000000001b4ddbc: hugepage_init+0x108/0x2c4
lr: c000000001b4dd98: hugepage_init+0xe4/0x2c4
sp: c000000012143b20
msr: 8000000002029033
current = 0xc0000000120d0f80
paca = 0xc00000001ec7e900 irqmask: 0x03 irq_happened: 0x01
pid = 1, comm = swapper/0
kernel BUG at mm/huge_memory.c:413!
[c000000012143b20] c0000000022c0468 blacklisted_initcalls+0x120/0x1c8 (unreliable)
[c000000012143bb0] c000000000012104 do_one_initcall+0x94/0x520
[c000000012143c90] c000000001b04da0 kernel_init_freeable+0x444/0x508
[c000000012143da0] c000000000012d8c kernel_init+0x44/0x188
[c000000012143e10] c00000000000cbf4 ret_from_kernel_thread+0x5c/0x64
Hence a FORCE_MAX_ZONEORDER value < 9 doesn't make sense with THP
enabled. We also cannot have a value > 9 because we are limited by
SECTION_SIZE_BITS:
#if (MAX_ORDER - 1 + PAGE_SHIFT) > SECTION_SIZE_BITS
#error Allocator MAX_ORDER exceeds SECTION_SIZE
#endif
We can select MAX_ORDER value 8 by disabling THP support but then that
results in pageblock_order > MAX_ORDER - 1 which is not fully tested/supported.
Cc: David Hildenbrand <david at redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar at linux.ibm.com>
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b779603978e1..a050f5f46df3 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -807,7 +807,7 @@ config DATA_SHIFT
config FORCE_MAX_ZONEORDER
int "Maximum zone order"
- range 8 9 if PPC64 && PPC_64K_PAGES
+ range 9 9 if PPC64 && PPC_64K_PAGES
default "9" if PPC64 && PPC_64K_PAGES
range 13 13 if PPC64 && !PPC_64K_PAGES
default "13" if PPC64 && !PPC_64K_PAGES
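
As a rough illustration of why the range collapses to a single value, the
three constraints from the commit message can be written out as plain C.
This is only a sketch: the constants below are assumptions for book3s64
with 64K pages and hash translation (PAGE_SHIFT = 16, SECTION_SIZE_BITS = 24,
a 16M PMD giving HPAGE_PMD_ORDER = 8), pageblock_order = 8 is the value
reported earlier in this thread, and the helper names are made up for this
example; none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for book3s64, 64K pages, hash translation. */
#define EX_PAGE_SHIFT        16  /* 64K base page */
#define EX_SECTION_SIZE_BITS 24  /* assumed sparsemem section size (16M) */
#define EX_HPAGE_PMD_ORDER    8  /* assumed: 16M PMD / 64K page */
#define EX_PAGEBLOCK_ORDER    8  /* value reported in this thread */

/* THP expects HPAGE_PMD_ORDER < MAX_ORDER. */
static bool thp_allowed(int max_order)
{
	return EX_HPAGE_PMD_ORDER < max_order;
}

/* Mirrors the "#if (MAX_ORDER - 1 + PAGE_SHIFT) > SECTION_SIZE_BITS" check. */
static bool fits_in_section(int max_order)
{
	return (max_order - 1 + EX_PAGE_SHIFT) <= EX_SECTION_SIZE_BITS;
}

/* True when pageblock_order does not exceed MAX_ORDER - 1. */
static bool pageblock_within_buddy(int max_order)
{
	return EX_PAGEBLOCK_ORDER <= max_order - 1;
}

/*
 * Return the only MAX_ORDER in [lo, hi] that satisfies all three
 * constraints, or -1 if there is no such value or more than one.
 */
static int sole_valid_max_order(int lo, int hi)
{
	int found = -1;

	assert(lo <= hi);
	for (int mo = lo; mo <= hi; mo++) {
		if (!thp_allowed(mo) || !fits_in_section(mo) ||
		    !pageblock_within_buddy(mo))
			continue;
		if (found != -1)
			return -1;	/* more than one candidate */
		found = mo;
	}
	return found;
}
```

Under these assumed constants, 8 fails the THP and pageblock checks and 10
fails the section-size check, which matches the "range 9 9" in the patch.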