[mm/debug_vm_pgtable] a97a171093: BUG:unable_to_handle_page_fault_for_address

Fri Jul 10 16:47:14 AEST 2020

On 07/09/2020 11:41 AM, kernel test robot wrote:
> [   94.349598] BUG: unable to handle page fault for address: ffffed10a7ffddff
> [   94.351039] #PF: supervisor read access in kernel mode
> [   94.352172] #PF: error_code(0x0000) - not-present page
> [   94.353256] PGD 43ffed067 P4D 43ffed067 PUD 43fdee067 PMD 0 
> [   94.354484] Oops: 0000 [#1] SMP KASAN
> [   94.355238] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc4-00002-ga97a17109332c #1
> [   94.360456] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [   94.361950] RIP: 0010:hugetlb_advanced_tests+0x137/0x699
> [   94.363026] Code: 8b 13 4d 85 f6 75 0b 48 ff 05 2c e4 6a 01 31 ed eb 41 bf f8 ff ff ff ba ff ff 37 00 4c 01 f7 48 c1 e2 2a 48 89 f9 48 c1 e9 03 <80> 3c 11 00 74 05 e8 cd c0 67 fa ba f8 ff ff ff 49 8b 2c 16 48 85
> [   94.366592] RSP: 0000:ffffc90000047d30 EFLAGS: 00010a06
> [   94.367693] RAX: 1ffffffff1049b80 RBX: ffff888380525308 RCX: 1ffff110a7ffddff
> [   94.369215] RDX: dffffc0000000000 RSI: 1ffff11087ffdc00 RDI: ffff88853ffeeff8
> [   94.370693] RBP: 000000000018e510 R08: 0000000000000025 R09: 0000000000000001
> [   94.372165] R10: ffff888380523c07 R11: ffffed10700a4780 R12: ffff88843208e510
> [   94.373674] R13: 0000000000000025 R14: ffff88843ffef000 R15: 000031e01ae61000
> [   94.375147] FS:  0000000000000000(0000) GS:ffff8883a3800000(0000) knlGS:0000000000000000
> [   94.376883] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   94.378051] CR2: ffffed10a7ffddff CR3: 0000000004e15000 CR4: 00000000000406a0
> [   94.379522] Call Trace:
> [   94.380073]  debug_vm_pgtable+0xd81/0x2029
> [   94.380871]  ? pmd_advanced_tests+0x621/0x621
> [   94.381819]  do_one_initcall+0x1eb/0xbd0
> [   94.382551]  ? trace_event_raw_event_initcall_finish+0x240/0x240
> [   94.383634]  ? rcu_read_lock_sched_held+0xb9/0x110
> [   94.388727]  ? rcu_read_lock_held+0xd0/0xd0
> [   94.389604]  ? __kasan_check_read+0x1d/0x30
> [   94.390485]  kernel_init_freeable+0x430/0x4f8
> [   94.391416]  ? rest_init+0x3f8/0x3f8
> [   94.392185]  kernel_init+0x14/0x1e8
> [   94.392918]  ret_from_fork+0x22/0x30
> [   94.393662] Modules linked in:
> [   94.394289] CR2: ffffed10a7ffddff
> [   94.395000] ---[ end trace 8ca5a1655dfb8c39 ]---

This bug is caused from here.

static inline struct mem_section *__nr_to_section(unsigned long nr)
{
#ifdef CONFIG_SPARSEMEM_EXTREME
        if (!mem_section)
                return NULL;
#endif
        if (!mem_section[SECTION_NR_TO_ROOT(nr)]) <-------- BUG
                return NULL;
        return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
}

static inline struct mem_section *__pfn_to_section(unsigned long pfn)
{
        return __nr_to_section(pfn_to_section_nr(pfn));
}

#define __pfn_to_page(pfn)                              \
({      unsigned long __pfn = (pfn);                    \
        struct mem_section *__sec = __pfn_to_section(__pfn);    \
        __section_mem_map_addr(__sec) + __pfn;          \
})

which is called via hugetlb_advanced_tests().

paddr = (__pfn_to_phys(pfn) | RANDOM_ORVALUE) & PMD_MASK;
pte = pte_mkhuge(mk_pte(pfn_to_page(PHYS_PFN(paddr)), prot));

Primary reason being RANDOM_ORVALUE, which is added to the paddr before
being masked with PMD_MASK. This clobbers up the pfn value which cannot
be searched in relevant memory sections. This problem stays hidden on
other configs where pfn_to_page() does not go via memory section search.
Dropping off RANDOM_ORVALUE solves the problem. Probably, just wanted to
drop that off during V2 series (https://lkml.org/lkml/2020/4/8/997) but
dont remember why ended up keeping it again.