Fragmented physical memory on powerpc/32

Christophe Leroy christophe.leroy at csgroup.eu
Tue Sep 13 16:11:39 AEST 2022



On 12/09/2022 at 23:16, Pali Rohár wrote:
> On Monday 12 September 2022 15:48:05 Mike Rapoport wrote:
>> On Sat, Sep 10, 2022 at 09:39:20AM +0000, Christophe Leroy wrote:
>>> + Adding Mike who might help if the problem is around memblock.
>>>
>>> On 08/09/2022 at 22:17, Pali Rohár wrote:
>>>> On Thursday 08 September 2022 17:35:11 Pali Rohár wrote:
>>>>> On Thursday 08 September 2022 15:25:14 Christophe Leroy wrote:
>>>>>> On 08/08/2022 at 20:40, Pali Rohár wrote:
>>>>>>> On Friday 10 June 2022 00:24:20 Pali Rohár wrote:
>>>>>>>> On Friday 20 May 2022 14:30:02 Pali Rohár wrote:
>>>>>>>>> + linux-mm
>>>>>>>>>
>>>>>>>>> Do you know what the requirements are for the kernel to support
>>>>>>>>> non-contiguous memory, and what is needed to enable it for 32-bit powerpc?
>>>>>>>>
>>>>>>>> Any hints?
>>>>>>>
>>>>>>> PING?
>>>>>>>
>>>>>>
>>>>>> The following three patches landed in the powerpc/next branch, so they
>>>>>> should soon be visible in linux-next too:
>>>>>>
>>>>>> fc06755e2562 ("powerpc/32: Drop a stale comment about reservation of
>>>>>> gigantic pages")
>>>>>> b0e0d68b1c52 ("powerpc/32: Allow fragmented physical memory")
>>>>>> 0115953dcebe ("powerpc/32: Remove wii_memory_fixups()")
>>>>>
>>>>> Oh, nice! I will test whether it allows me to access more than 2GB
>>>>> of RAM from a 4GB DDR3 module with 32-bit addressing mode on a P2020 CPU.
>>>>
>>>> Hello! OK, I have tried it from the powerpc/next branch, but it seems it
>>>> does not work. I'm just getting an early kernel crash.
>>>>
>>>> [    0.000000] CPU maps initialized for 1 thread per core
>>>> [    0.000000]  (thread shift is 0)
>>>> [    0.000000] -----------------------------------------------------
>>>> [    0.000000] phys_mem_size     = 0xbe500000
>>>> [    0.000000] dcache_bsize      = 0x20
>>>> [    0.000000] icache_bsize      = 0x20
>>>> [    0.000000] cpu_features      = 0x0000000010010108
>>>> [    0.000000]   possible        = 0x0000000010010108
>>>> [    0.000000]   always          = 0x0000000010010108
>>>> [    0.000000] cpu_user_features = 0x84e08000 0x08000000
>>>> [    0.000000] mmu_features      = 0x00020010
>>>> [    0.000000] -----------------------------------------------------
>>>> mpc85xx_rdb_setup_arch()
>>>> [    0.000000] ioremap() called early from of_iomap+0x48/0x80. Use early_ioremap() instead
>>>> [    0.000000] MPC85xx RDB board from Freescale Semiconductor
>>>> [    0.000000] barrier-nospec: using isync; sync as speculation barrier
>>>> [    0.000000] barrier-nospec: patched 182 locations
>>>> [    0.000000] Top of RAM: 0xff700000, Total RAM: 0xbe500000
>>>> [    0.000000] Memory hole size: 1042MB
>>>> [    0.000000] Zone ranges:
>>>> [    0.000000]   Normal   [mem 0x0000000000000000-0x000000002fffffff]
>>>> [    0.000000]   HighMem  [mem 0x0000000030000000-0x00000000ff6fffff]
>>>> [    0.000000] Movable zone start for each node
>>>> [    0.000000] Early memory node ranges
>>>> [    0.000000]   node   0: [mem 0x0000000000000000-0x000000007fffffff]
>>>> [    0.000000]   node   0: [mem 0x00000000c0200000-0x00000000eeffffff]
>>>> [    0.000000]   node   0: [mem 0x00000000f0000000-0x00000000ff6fffff]
>>>> [    0.000000] Initmem setup node 0 [mem 0x0000000000000000-0x00000000ff6fffff]
>>>> [    0.000000] MMU: Allocated 1088 bytes of context maps for 255 contexts
>>>> [    0.000000] percpu: Embedded 11 pages/cpu s14196 r8192 d22668 u45056
>>>> [    0.000000] pcpu-alloc: s14196 r8192 d22668 u45056 alloc=11*4096
>>>> [    0.000000] pcpu-alloc: [0] 0 [0] 1
>>>> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 777792
>>>> [    0.000000] Kernel command line: root=ubi0:rootfs rootfstype=ubifs ubi.mtd=rootfs,2048 rootflags=chk_data_crc rw console=ttyS0,115200
>>>> [    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
>>>> [    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
>>>> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
>>>> [    0.000000] Kernel attempted to read user page (7df58) - exploit attempt? (uid: 0)
>>>> [    0.000000] BUG: Unable to handle kernel data access on read at 0x0007df58
>>>> [    0.000000] Faulting instruction address: 0xc01c8348
>>>> [    0.000000] Oops: Kernel access of bad area, sig: 11 [#1]
>>>> [    0.000000] BE PAGE_SIZE=4K SMP NR_CPUS=2 P2020RDB-PC
>>>> [    0.000000] Modules linked in:
>>>> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.0.0-rc2-0caacb197b677410bdac81bc34f05235+ #121
>>>> [    0.000000] NIP:  c01c8348 LR: c01cb2bc CTR: 0000000a
>>>> [    0.000000] REGS: c10d7e20 TRAP: 0300   Not tainted  (6.0.0-rc2-0caacb197b677410bdac81bc34f05235+)
>>>> [    0.000000] MSR:  00021000 <CE,ME>  CR: 48044224  XER: 00000000
>>>> [    0.000000] DEAR: 0007df58 ESR: 00000000
>>>> [    0.000000] GPR00: c01cb294 c10d7f10 c1045340 00000001 00000004 c112bcc0 00000015 eedf1000
>>>> [    0.000000] GPR08: 00000003 0007df58 00000000 f0000000 28044228 00000200 00000000 00000000
>>>> [    0.000000] GPR16: 00000000 00000000 00000000 0275cb7a c0000000 00000001 0000075f 00000000
>>>> [    0.000000] GPR24: c1031004 00000000 00000000 00000001 c10f0000 eedf1000 00080000 00080000
>>>> [    0.000000] NIP [c01c8348] free_unref_page_prepare.part.93+0x48/0x60
>>>> [    0.000000] LR [c01cb2bc] free_unref_page+0x84/0x4b8
>>>> [    0.000000] Call Trace:
>>>> [    0.000000] [c10d7f10] [eedf1000] 0xeedf1000 (unreliable)
>>>> [    0.000000] [c10d7f20] [c01cb294] free_unref_page+0x5c/0x4b8
>>>> [    0.000000] [c10d7f70] [c1007644] mem_init+0xd0/0x194
>>>> [    0.000000] [c10d7fa0] [c1000e4c] start_kernel+0x4c0/0x6d0
>>>> [    0.000000] [c10d7ff0] [c00003e0] set_ivor+0x13c/0x178
>>>> [    0.000000] Instruction dump:
>>>> [    0.000000] 552817be 5509103a 7d294214 55293830 7d4a4a14 812a003c 814a0038 5529002a
>>>> [    0.000000] 7c892050 5484c23a 5489eafa 548406fe <7d2a482e> 7d242430 5484077e 90870010
>>>> [    0.000000] ---[ end trace 0000000000000000 ]---
>>>> [    0.000000]
>>>> [    0.000000] Kernel panic - not syncing: Fatal exception
>>>> [    0.000000] Rebooting in 1 seconds..
>>>> [    0.000000] System Halted, OK to turn off power
>>>>
>>>> 4GB DDR3 SODIMM module is set via Freescale LBC to the whole 4 GB
>>>> address range. And on ranges:
>>>> 0x0000_0000 - 0x7fff_ffff
>>>> 0xc020_0000 - 0xeeff_ffff
>>>> 0xf000_0000 - 0xff6f_ffff
>>>> there is no peripheral device, they are free for DRAM. Between these
>>>> physical ranges are mapped peripheral devices (PCIe and NOR).
>>>>
>>>> Any idea if I'm doing something wrong or there can be a bug in memory code?
>>>>
>>>> It is quite suspicious that "Initmem setup node 0" prints a single range
>>>> that also covers peripherals, not just DRAM. The crash is at address
>>>> 0xc01c8348, which belongs to PCIe.
>>>>
>>>
>>> Yes I also find that "Initmem setup node 0" suspicious.
>>>
>>> However, the crash address 0xc01c8348 is a valid kernel address. That's a
>>> virtual address, not a physical address, so that's not PCIe. It is in the
>>> kernel linear mapping, so it is likely physical address 0x001c8348
>>> offset by PAGE_OFFSET, which is 0xc0000000.
>>
>> If I read the dump correctly, 0xc01c8348 is the PC of the instruction that
>> crashed and the access was to 0x0007df58, which seems to be well inside the
>> 0x0000_0000 - 0x7fff_ffff range.
> 
> I have tried to read and write memory at address 0x0007df58 in U-Boot
> and it works fine without any crash.

You are mixing physical and virtual addresses.

With U-Boot, you checked physical address 0x0007df58. In Linux, that 
corresponds to virtual address 0xc007df58 in the kernel linear mapping.

The Oops happens at virtual address 0x0007df58, which is definitely not a 
valid kernel address as it is below PAGE_OFFSET (0xc0000000).
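
To make the distinction concrete, here is a small sketch of the arithmetic. 
It assumes PAGE_OFFSET is 0xc0000000 and that RAM starts at physical 
address 0, as in this thread. The helper names are made up for 
illustration and are not the kernel's own macros, and the translation is 
only meaningful for lowmem covered by the linear mapping.

```python
# Sketch of the 32-bit linear-mapping arithmetic discussed in this thread.
# Assumptions: PAGE_OFFSET is 0xc0000000 and physical RAM starts at 0.
# The function names below are hypothetical, chosen only for this example.

PAGE_OFFSET = 0xC0000000
MASK32 = 0xFFFFFFFF  # keep results within a 32-bit address space


def phys_to_virt(paddr):
    """Linear-map virtual address for a lowmem physical address."""
    return (paddr + PAGE_OFFSET) & MASK32


def virt_to_phys(vaddr):
    """Inverse of phys_to_virt(); only valid for linear-map addresses."""
    return (vaddr - PAGE_OFFSET) & MASK32


def is_kernel_address(vaddr):
    """True if vaddr lies at or above PAGE_OFFSET."""
    return vaddr >= PAGE_OFFSET


if __name__ == "__main__":
    # Physical address that was tested from U-Boot.
    print(hex(phys_to_virt(0x0007DF58)))   # 0xc007df58
    # Faulting instruction address (NIP) from the Oops.
    print(hex(virt_to_phys(0xC01C8348)))   # 0x1c8348
    # Faulting data address (DEAR) from the Oops.
    print(is_kernel_address(0x0007DF58))   # False
```

So the address poked from U-Boot and the address the kernel faulted on 
share the same number but live in different address spaces.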


> 
> I repeated that boot and it always failed with same errors at same
> address. I have also tried to use different 4GB DDR module (just in case
> if it is non-functional) but it failed on the same error on the same
> address.
> 
>> And the "Early memory node ranges" look consistent with the memory layout
>> above.
>>
>> My guess would be that something went wrong in the linear map setup, but it
>> won't hurt running with "memblock=debug" added to the kernel command line
>> to see if there is anything suspicious there.
> 
> Here is boot log on serial console with memblock=debug command line:
> 
...
> 
> Do you need something more for debug?

Can you send me the 'vmlinux' used to generate the above Oops so that I 
can see exactly where we are in function mem_init()?

And could you also try without CONFIG_HIGHMEM, just in case?


> 
>>> Do you have a way to reproduce this problem under QEMU?
> 
> Well, I really do not know how to run it in QEMU. IIRC QEMU does not
> have support for P2020 processor. Is there any guidance?
> 

I don't know. I guess there might be the same problem with any e500. But 
as far as I can see, the 8544 emulation on QEMU has a limit of 3GB and 
provides memory as a single block.

Christophe

