rcutorture’s init segfaults in ppc64le VM

Michael Ellerman mpe at ellerman.id.au
Tue Feb 8 21:09:17 AEDT 2022


Paul Menzel <pmenzel at molgen.mpg.de> writes:
> Dear Linux folks,

Hi Paul,

> On the POWER8 server IBM S822LC running Ubuntu 21.10, building Linux 
> 5.17-rc2+ with rcutorture tests

I'm not sure if that's the host kernel version or the version of the
tree you're running rcutorture from. Can you tell us the sha1 of your
host kernel and of the tree you're running rcutorture from?

>      $ tools/testing/selftests/rcutorture/bin/torture.sh --duration 10
>
> the built init
>
>      $ file tools/testing/selftests/rcutorture/initrd/init
>      tools/testing/selftests/rcutorture/initrd/init: ELF 64-bit LSB
>      executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically
>      linked, BuildID[sha1]=0ded0e45649184a296f30d611f7a03cc51ecb616, for
>      GNU/Linux 3.10.0, stripped

Mine looks pretty much identical:

  $ file tools/testing/selftests/rcutorture/initrd/init
  tools/testing/selftests/rcutorture/initrd/init: ELF 64-bit LSB
  executable, 64-bit PowerPC or cisco 7500, version 1 (SYSV), statically
  linked, BuildID[sha1]=86078bf6e5d54ab0860d36aa9a65d52818b972c8, for
  GNU/Linux 3.10.0, stripped


> segfaults in QEMU. From one of the log files

But mine doesn't segfault, it runs fine and the test completes.

What qemu version are you using?

I tried 4.2.1 and 6.2.0, both worked.


> /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-rcutorture/TREE03/console.log
>
>      [    1.119803][    T1] Run /init as init process
>      [    1.122011][    T1] init[1]: segfault (11) at f0656d90 nip 10000a18 lr 0 code 1 in init[10000000+d0000]
>      [    1.124863][    T1] init[1]: code: 2c2903e7 f9210030 4081ff84 4bffff58 00000000 01000000 00000580 3c40100f
>      [    1.128823][    T1] init[1]: code: 38427c00 7c290b78 782106e4 38000000 <f821ff81> 7c0803a6 f8010000 e9028010

The disassembly from 3c40100f is:
  lis     r2,4111
  addi    r2,r2,31744
  mr      r9,r1
  rldicr  r1,r1,0,59
  li      r0,0
  stdu    r1,-128(r1)		<- fault
  mtlr    r0
  std     r0,0(r1)
  ld      r8,-32752(r2)
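
The D-form and DS-form words in that dump can be decoded by hand to
double-check the listing. The following is only a minimal sketch, not a
real disassembler: the helper names are made up for illustration, and it
covers just the addi/addis/ld/std families, so the mr, rldicr and mtlr
words are reported as not handled.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    mask = 1 << (bits - 1)
    return (value ^ mask) - mask


def decode(word):
    """Decode a few PowerPC D-form/DS-form instruction words (illustrative only)."""
    opcode = word >> 26            # primary opcode, top 6 bits
    rt = (word >> 21) & 0x1f       # target (or source, for stores) register
    ra = (word >> 16) & 0x1f       # base register
    low16 = word & 0xffff

    if opcode in (14, 15):         # addi / addis
        imm = sign_extend(low16, 16)
        if ra == 0:                # ra == 0 means "no base": li / lis
            return f"{'li' if opcode == 14 else 'lis'} r{rt},{imm}"
        return f"{'addi' if opcode == 14 else 'addis'} r{rt},r{ra},{imm}"

    if opcode in (58, 62):         # DS-form loads (58) and stores (62)
        xo = low16 & 0x3           # low two bits select the variant
        ds = sign_extend(low16 & 0xfffc, 16)
        names = {(58, 0): "ld", (58, 1): "ldu", (58, 2): "lwa",
                 (62, 0): "std", (62, 1): "stdu"}
        name = names.get((opcode, xo))
        if name is not None:
            return f"{name} r{rt},{ds}(r{ra})"

    return f".long 0x{word:08x}  (not handled by this sketch)"


# Words taken from the "code:" lines of the console log quoted above.
for w in (0x3c40100f, 0x38427c00, 0x38000000,
          0xf821ff81, 0xf8010000, 0xe9028010):
    print(f"{w:08x}  {decode(w)}")
```

The word marked <f821ff81> in the log decodes to the stdu, which matches
the line flagged as the fault in the listing.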


I think you'll find that's the code at the ELF entry point. You can
check with:

 $ readelf -e tools/testing/selftests/rcutorture/initrd/init | grep Entry
   Entry point address:               0x10000c0c

 $ objdump -d tools/testing/selftests/rcutorture/initrd/init | grep -m 1 -A 8 10000c0c
    10000c0c:   0e 10 40 3c     lis     r2,4110
    10000c10:   00 7b 42 38     addi    r2,r2,31488
    10000c14:   78 0b 29 7c     mr      r9,r1
    10000c18:   e4 06 21 78     rldicr  r1,r1,0,59
    10000c1c:   00 00 00 38     li      r0,0
    10000c20:   81 ff 21 f8     stdu    r1,-128(r1)
    10000c24:   a6 03 08 7c     mtlr    r0
    10000c28:   00 00 01 f8     std     r0,0(r1)
    10000c2c:   10 80 02 e9     ld      r8,-32752(r2)
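
If readelf isn't handy, the entry point can also be read straight out of
the ELF header. This is a rough sketch, assuming a 64-bit ELF file and
using hypothetical helper names. The second helper works backwards from
the log: if the faulting stdu really is the sixth instruction of the
entry code (five instructions before it, as in the objdump output
above), the entry point of the failing binary would be nip minus 20
bytes.

```python
import struct


def elf64_entry(header: bytes) -> int:
    """Return e_entry from the first 64 bytes of a 64-bit ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if header[4] != 2:                      # EI_CLASS: 2 means ELFCLASS64
        raise ValueError("not a 64-bit ELF file")
    endian = "<" if header[5] == 1 else ">"  # EI_DATA: 1 means little endian
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2)
    # and e_version (4), so it sits at offset 24.
    (entry,) = struct.unpack_from(endian + "Q", header, 24)
    return entry


def entry_implied_by_fault(nip: int, insns_before_fault: int) -> int:
    """Entry address, assuming the fault is in the entry code itself."""
    return nip - 4 * insns_before_fault     # every instruction is 4 bytes


# nip from the quoted segfault line; 5 instructions precede the stdu.
print(hex(entry_implied_by_fault(0x10000a18, 5)))
```

To use the first helper, read the header with something like
open(path, "rb").read(64) and pass the bytes in. If the value it returns
for the failing init matches the one printed by the second helper, the
fault is in the very first instructions run in userspace.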


The fault you're seeing is the first store using the stack pointer (r1),
which is set up by the kernel.

The fault address f0656d90 is weirdly low; the stack should be up near 128TB.
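
To put numbers on that, here is a small sketch of the arithmetic. It
assumes the address in the segfault line is the effective address of the
stdu, i.e. r1 - 128, and it takes 128TB to mean 2^47 bytes. The variable
names are just for illustration.

```python
FAULT_ADDR = 0xf0656d90      # "segfault (11) at f0656d90" from the log
FRAME_SIZE = 128             # stdu r1,-128(r1) stores at r1 - 128

# Value r1 must have held when the stdu executed.
r1 = FAULT_ADDR + FRAME_SIZE

# rldicr r1,r1,0,59 clears the low 4 bits, so r1 should be 16-byte aligned.
aligned = (r1 & 0xf) == 0

TB = 1 << 40
expected_top = 128 * TB      # roughly where the stack is expected to be

print(f"r1 at fault       : {r1:#x} (16-byte aligned: {aligned})")
print(f"fault below 4GB   : {FAULT_ADDR < (1 << 32)}")
print(f"expected stack top: {expected_top:#x}")
```

So the stack pointer fits in 32 bits, which is nowhere near where a
64-bit process's stack should be.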

I'm not sure how we end up with a bad r1.

Can you dump some info about the kernel that was built, something like:

$ file /dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-rcutorture/TREE03/vmlinux

And maybe paste/attach the full log; there might be a clue in there somewhere.

cheers

