[PATCH] Revert "powerpc: Switch to relative jump labels"

Greg Kurz groug at kaod.org
Tue Jun 8 03:03:43 AEST 2021


On Tue, 01 Jun 2021 17:36:15 +1000
Michael Ellerman <mpe at ellerman.id.au> wrote:

> Roman Bolshakov <r.bolshakov at yadro.com> writes:
> > On Sat, May 29, 2021 at 09:39:49AM +1000, Michael Ellerman wrote:
> >> Roman Bolshakov <r.bolshakov at yadro.com> writes:
> >> > This reverts commit b0b3b2c78ec075cec4721986a95abbbac8c3da4f.
> >> >
> >> > Otherwise, direct kernel boot with initramfs no longer works in QEMU.
> >> > It's broken in some bizarre way because a valid initramfs is not
> >> > recognized anymore:
> >> >
> >> >   Found initrd at 0xc000000001f70000:0xc000000003d61d64
> >> >   rootfs image is not initramfs (XZ-compressed data is corrupt); looks like an initrd
> >> >
> >> > The issue is observed on v5.13-rc3 if the kernel is built with
> >> > defconfig, GCC 7.5.0 and GNU ld 2.32.0.
> >> 
> >> Are you able to try a different compiler?
> >
> > Hi Michael,
> >
> > I've just tried GCC 9.3.1 and the result is the same.
> >
> > The offending patch has assembly inlines, they typically go through
> > binutils/GAS and it might also be a case when older binutils doesn't
> > implement something properly (i've seen this on x86 and arm).
> 
> Jump labels use asm goto, which is a compiler feature, but you're right
> that the binutils version could also be important.
> 
> What ld versions have you tried?
> 
> And are those the toolchains from kernel.org or somewhere else?
> 
> >> I test booting qemu constantly, but I don't use GCC 7.5.
> >>
> >> And what qemu version are you using?
> >> 
> >
> > QEMU 3.1.1, but I've also tried 6.0.50 (QEMU master, 62c0ac5041e913) and
> > it fails the same way.
> 
> OK.
> 
> >> I assume your initramfs is compressed with XZ? How large is it
> >> compressed?
> >> 
> >
> > Yes, XZ. initramfs size is 30 MB (around 100 MB cpio size).
> >
> > It's interesting that the issue doesn't happen if I pass initramfs from
> > host (11MB), then the initramfs can be recognized. It might be related
> > to initramfs size then and bigger initramfs that used to work no longer
> > work with v5.13-rc3.
> 
> Are you using qemu's -initrd option to pass the initramfs, or are you
> building the initramfs into the kernel?
> 

Hi Michael,

I'm hitting the same issue while trying to boot a RHEL9 guest with
the distro's default kernel/initramfs and grub.

Interestingly, this doesn't happen with older QEMU, e.g. 4.2.0, which
is shipped with RHEL8. I've bisected it to this commit from the
QEMU 5.0 era:


commit 8897ea5a9fc0aafa5ed7eee1e0c49893b91a2d87
Author: David Gibson <david at gibson.dropbear.id.au>
Date:   Thu Nov 28 16:37:04 2019 +1100

    spapr: Don't attempt to clamp RMA to VRMA constraint


This mostly changes how memory is presented in the FDT.

Before 8897ea5a9fc, for a VM with 1 gig of RAM, we had several nodes,
the lowest one (at address 0) being the VRMA (limited to 256 megs).

        memory@20000000 {
                ibm,associativity = <0x04 0x00 0x00 0x00 0x00>;
                reg = <0x00 0x20000000 0x00 0x20000000>;
                device_type = "memory";
        };

        memory@10000000 {
                ibm,associativity = <0x04 0x00 0x00 0x00 0x00>;
                reg = <0x00 0x10000000 0x00 0x10000000>;
                device_type = "memory";
        };

        memory@0 {
                ibm,associativity = <0x04 0x00 0x00 0x00 0x00>;
                reg = <0x00 0x00 0x00 0x10000000>;
                device_type = "memory";
        };


Now we have a single node for all RAM:

        memory@0 {
                ibm,associativity = <0x04 0x00 0x00 0x00 0x00>;
                reg = <0x00 0x00 0x00 0x40000000>;
                device_type = "memory";
        };
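
For what it's worth, the reg cells in the two dumps can be decoded by
hand to check that both layouts describe the same 1 gig of RAM. This is
only a sketch: it assumes 2 address cells and 2 size cells (which is
what the four-cell reg properties suggest), and decode_reg is a made-up
helper name, not something from QEMU or the kernel.

```python
MiB = 1 << 20

def decode_reg(cells):
    """Decode a 4-cell reg property into (base, size).

    Assumption: 2 address cells followed by 2 size cells.
    """
    addr_hi, addr_lo, size_hi, size_lo = cells
    return (addr_hi << 32) | addr_lo, (size_hi << 32) | size_lo

# reg values copied from the FDT dumps in this mail
before = [
    decode_reg([0x00, 0x20000000, 0x00, 0x20000000]),
    decode_reg([0x00, 0x10000000, 0x00, 0x10000000]),
    decode_reg([0x00, 0x00, 0x00, 0x10000000]),
]
after = [decode_reg([0x00, 0x00, 0x00, 0x40000000])]

for label, nodes in (("before", before), ("after", after)):
    for base, size in sorted(nodes):
        print(f"{label}: base={base:#010x} size={size // MiB} MiB")
    print(f"{label}: total={sum(size for _, size in nodes) // MiB} MiB")
```

Both layouts add up to 1024 MiB; the only difference is that the old
one starts with a separate 256 MiB node at address 0.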

If I set an arbitrary constraint again on the VRMA, I get the
multiple memory nodes back and, depending on the value, the
boot succeeds. In my 1 gig RHEL9 guest case, I need to set
a VRMA size <= 0x32000000.

Not sure how this can relate to the initramfs though. I just see
that grub doesn't map it at the same place:

0x0000000003100000 when boot fails

0x000000000f000000 when boot succeeds
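
To put these numbers side by side, here is a trivial conversion to MiB
of the values quoted in this mail. The dictionary labels are only
descriptive and the 256 MiB entry is the size of the old first memory
node from the FDT dump; nothing here is taken from grub or QEMU code.

```python
MiB = 1 << 20

# Values quoted in this mail; labels are only descriptive.
values = {
    "initramfs address, failing boot": 0x0000000003100000,
    "initramfs address, working boot": 0x000000000f000000,
    "largest VRMA size that still boots": 0x32000000,
    "old first memory node size": 0x10000000,
}

in_mib = {name: value // MiB for name, value in values.items()}
for name, mib in in_mib.items():
    print(f"{name}: {mib} MiB")
```

Both load addresses sit below the 256 MiB mark, so that alone doesn't
tell why one placement works and the other doesn't.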

In case this rings a bell...

> > So, I've created a small initramfs using only static busybox (2.7M
> > uncompressed, 960K compressed with xz). No error is produced and it
> > boots fine.
> >
> > If I add a dummy file (11M off /dev/urandom) to the small busybox
> > initramfs, it boots and the init is started but I'm seeing the error:
> >
> >   rootfs image is not initramfs (XZ-compressed data is corrupt); looks like an initrd
> >
> > sha1sum of the file inside initramfs doesn't match sha1sum on the host.
> >
> >   guest # sha1sum dummy
> >   407c347e671ddd00f69df12b3368048bad0ebf0c  dummy
> >   # QEMU: Terminated
> >   host $ sha1sum dummy
> >   ed8494b3eecab804960ceba2c497270eed0b0cd1  dummy
> >
> > sha1sum is the same in the guest and on the host for 10M dummy file:
> >
> >   guest # sha1sum dummy
> >   43855f7a772a28cce91da9eb8f86f53bc807631f  dummy
> >   # QEMU: Terminated
> >   host $ sha1sum dummy
> >   43855f7a772a28cce91da9eb8f86f53bc807631f  dummy
> >
> > That might explain why bigger initramfs (or initramfs with bigger files)
> > doesn't boot - because some files might appear corrupted inside the guest.
> >
> > Here're the sources of the initrd along with 11M dummy file:
> >   https://drive.yadro.com/s/W8HdbPnaKmPPwK4
> >
> > I've compressed it with:
> >   $ find . 2>/dev/null | cpio -ocR 0:0 | xz  --check=crc32 > ../initrd-dummy.xz
> >
> > Hope this helps,
> 
> I haven't been able to reproduce any corruption, with various initramfs
> sizes.
> 
> Can you send us your kernel .config & qemu command line.
> 
> And can you try the patch below?
> 
> cheers
> 
> 
> diff --git a/arch/powerpc/kernel/jump_label.c b/arch/powerpc/kernel/jump_label.c
> index ce87dc5ea23c..3d9878124cde 100644
> --- a/arch/powerpc/kernel/jump_label.c
> +++ b/arch/powerpc/kernel/jump_label.c
> @@ -13,6 +13,9 @@ void arch_jump_label_transform(struct jump_entry *entry,
>  {
>  	struct ppc_inst *addr = (struct ppc_inst *)jump_entry_code(entry);
>  
> +	if (!is_kernel_text((unsigned long)addr) && !is_kernel_inittext((unsigned long)addr))
> +		printk("%s: addr %px %pS is not kernel text?\n", __func__, addr, addr);
> +

I've applied this too. It doesn't produce any output in the crashing case.
On the contrary I get tons of them when I run with the hacked VRMA size,
but they show up much later, after we've already freed the initrd memory.

Cheers,

--
Greg

>  	if (type == JUMP_LABEL_JMP)
>  		patch_branch(addr, jump_entry_target(entry), 0);
>  	else


