Re: [Help] Microwatt (Zynqwatt) — Kernel halts after Radix MMU init on booting Linux on Zynq version of Microwatt

Sat Nov 22 13:48:12 AEDT 2025

On Sat, Nov 22, 2025 at 1:59 AM Mohammad Amin Nili
<manili.devteam at gmail.com> wrote:
>
> The problem with the *first* (early) path is that a udbg driver
> often calls `early_ioremap()` **before** any MMU/memory
> setup has completed. The MMU setup for 64-bit happens here:
> - https://elixir.bootlin.com/linux/v6.18-rc5/source/arch/powerpc/kernel/setup_64.c#L418
>
> The 32-bit setup (`setup_32.c`) calls `udbg_early_init` *after*
> `early_ioremap_init`, so it doesn’t hit this ordering problem:
> - https://elixir.bootlin.com/linux/v6.18-rc5/source/arch/powerpc/kernel/setup_32.c#L87
>
> In my case, attempting to initialize the Xilinx udbg driver via
> the early path can (and did) cause kernel panics because
> `early_ioremap()` runs before the memory/MMU is ready.
> Practically, the only reliable place for my driver to initialize
> is later in `init/main.c`. Is this the intended/expected behavior
> for PPC64? Or am I missing something?

Nah it's just broken for your case. Most ppc64 development happens on
IBM hardware and the udbg drivers for those platforms (pseries for
VMs, powernv for bare metal) don't require the MMU to be set up.
Enabling debug output pre-MMU is handy since it lets you debug MMU
setup issues, but obviously it's not going to work if your udbg driver
needs to map stuff. You could avoid depending on the MMU by using the
potato uart driver's trick of just disabling data relocation when
writing to the UART's registers. That's pretty slow and a bit janky
though. A better fix might be to split udbg init into early (pre-MMU)
and late variants, with the udbg drivers that need the MMU being
initialised in the late one.
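
To illustrate the early/late split, here is a rough userspace model of
the idea. It is only a sketch and not kernel code: struct udbg_backend,
the needs_mmu flag, mmu_ready and the *_model() functions are all
made-up names, and the real change would have to hook into the existing
udbg_early_init() path and the setup code after early MMU init.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one udbg driver (not a kernel struct). */
struct udbg_backend {
	const char *name;
	bool needs_mmu;     /* true if init has to map the UART registers */
	bool initialised;
};

/* Stands in for "early MMU setup has completed". */
static bool mmu_ready;

/* Bring the backend up if its requirements are met, otherwise defer.
 * Returns true if the backend is usable after the call. */
static bool udbg_try_init(struct udbg_backend *b)
{
	if (b->initialised)
		return true;
	if (b->needs_mmu && !mmu_ready)
		return false;   /* too early, retry in the late phase */
	b->initialised = true;
	return true;
}

/* Early phase: only backends that work without the MMU come up here. */
static bool udbg_early_init_model(struct udbg_backend *b)
{
	return udbg_try_init(b);
}

/* Stand-in for the point where the MMU becomes usable. */
static void mmu_init_model(void)
{
	mmu_ready = true;
}

/* Late phase: must only run once mappings can be created. */
static bool udbg_late_init_model(struct udbg_backend *b)
{
	assert(mmu_ready);
	return udbg_try_init(b);
}
```

With this model a backend that has needs_mmu set is skipped by the
early call and picked up by the late one, while an MMU-independent
backend is available from the early call onwards.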

> 2. I had to inject a busy-wait loop between lines 125–126 in
> `kernel/rcu/tiny.c` to prevent a crash when/after switching to
> userspace:
> - https://elixir.bootlin.com/linux/v6.18-rc5/source/kernel/rcu/tiny.c#L125
>
> Here is the loop I added:
> for (volatile uint32_t i = 0; i < 10; i++);
>
> If I omit that trivial busy-wait, the kernel crashes while/after
> switching to userspace with an error LIKE:
>
> [   42.397074] kernel tried to execute exec-protected page (c00c000000000000) - exploit attempt? (uid: 0)
> [   42.408148] BUG: Unable to handle kernel instruction fetch
> [   42.414964] Faulting instruction address: 0xc00c000000000000
> Vector: 400 (Instruction Access) at [c00000000207fae0]
>     pc: c00c000000000000
>     lr: c00000000008c798: rcu_process_callbacks+0xf8/0x100
>     sp: c00000000207fd80
>    msr: 900000001000b033
>   current = 0xc000000002056300
>   paca    = 0xc0000000016e8000 irqmask: 0x03 irq_happened: 0x09
>     pid   = 10, comm = ksoftirqd/0
> Linux version 6.18.0-rc5-00111-g6fa9041b7177-dirty (manili at manili) (powerpc64le-linux-gcc.br_real (Buildroot 2021.11-18033-g83947c7bb6) 14.3.0, GNU ld (GNU Binutils) 2.43.1) #3 Thu Nov 20 09:33:11 EST 2025
> enter ? for help
> [link register   ] c00000000008c798 rcu_process_callbacks+0xf8/0x100
> [c00000000207fd80] c00000000008c748 rcu_process_callbacks+0xa8/0x100 (unreliable)
> [c00000000207fe00] c00000000003f320 handle_softirqs+0x1ec/0x23c
> [c00000000207ff00] c00000000003f3a8 run_ksoftirqd+0x38/0x58
> [c00000000207ff20] c00000000005f9c4 smpboot_thread_fn+0x1a0/0x1a8
> [c00000000207ff80] c00000000005b190 kthread+0x1c0/0x1cc
> [c00000000207ffe0] c00000000000b160 start_kernel_thread+0x14/0x18
> mon>
>
> The exact addresses in the error vary, but the crash
> template is the same. My suspicion is that this is a
> core/thread synchronization issue. Do you have any
> ideas on this issue and why a simple while loop is able
> to solve it?

That's very odd. rcu_reclaim_tiny() is probably being folded into
rcu_process_callbacks() by the compiler and the crash occurs when
branching to the callback function from the rcu_head
(https://elixir.bootlin.com/linux/v6.18-rc5/source/kernel/rcu/tiny.c#L95).
That said, the "callback" address it branched to (0xc00c000000000000)
is actually the base of the vmemmap (i.e. the struct page array) so I
doubt that's actually the callback address stored in the rcu_head. You
can use xmon to dump the registers and examine memory to confirm this.
It's hard to say why this is happening, but it's pretty likely to
either be the compiler optimizing away code you'd prefer to keep or a
bug in the core itself.
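
To make the suspected failure point concrete, here is a simplified
userspace model of a callback-processing loop. It is not the code in
tiny.c: struct cb_head, count_cb() and process_callbacks() are made-up
names, and the real function does more (locking annotations, the
kvfree case, and so on). The point is just that the function pointer is
loaded from the head and then branched to, so a bad value in that load
shows up as an instruction fetch fault with the loop's address in lr.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct rcu_head. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

/* Counts how many callbacks have run (for demonstration only). */
static int cb_invoked;

static void count_cb(struct cb_head *head)
{
	(void)head;
	cb_invoked++;
}

/* Walk the list and invoke each callback once.
 * Returns the number of callbacks invoked. */
static int process_callbacks(struct cb_head *list)
{
	int n = 0;

	while (list) {
		/* Read next first: a real callback may free the head. */
		struct cb_head *next = list->next;
		void (*f)(struct cb_head *) = list->func;

		assert(f != NULL);
		list->func = NULL;
		f(list);        /* indirect branch to the loaded pointer */
		list = next;
		n++;
	}
	return n;
}
```

If the value that ends up in f differs from what is stored in memory at
the head, that points away from the list contents and towards the
emitted code or the core.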

I'd compare the disasm of rcu_process_callbacks() with and without
your wait loop added and see how the emitted code changes. If adding
the loop changes nothing then it might be a logic bug in microwatt
itself or some other timing induced problem.

>
> Bests,
> Manili