<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <div class="moz-cite-prefix">On 25/12/2021 12:31, Nicholas Piggin
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:1640427851.k47q6y3qjb.astroid@bobo.none">
      <pre class="moz-quote-pre" wrap="">Excerpts from Stijn Tintel's message of December 22, 2021 11:20 am:
</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">Hi,

After upgrading my Power8 server from 5.10 LTS to 5.15 LTS, I started
experiencing CPU hard lockups, usually rather quickly after boot:


watchdog: CPU 3 self-detected hard LOCKUP @
queued_spin_lock_slowpath+0x154/0x2d0
watchdog: CPU 3 TB:265651929071, last heartbeat TB:259344820187 (12318ms
ago)
</pre>
      </blockquote>
    </blockquote>
    snip<br>
    <blockquote type="cite"
      cite="mid:1640427851.k47q6y3qjb.astroid@bobo.none">
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">Bisecting led to the following commit:

deb9b13eb2571fbde164ae012c77985fd14f2f02 is the first bad commit
commit deb9b13eb2571fbde164ae012c77985fd14f2f02
Author: Davidlohr Bueso <a class="moz-txt-link-rfc2396E" href="mailto:dave@stgolabs.net">&lt;dave@stgolabs.net&gt;</a>
Date:   Mon Mar 8 17:59:50 2021 -0800

   powerpc/qspinlock: Use generic smp_cond_load_relaxed
</pre>
      </blockquote>
      <pre class="moz-quote-pre" wrap="">Thanks for bisecting and reporting this.</pre>
    </blockquote>
    Thanks for your response, much appreciated.
    <blockquote type="cite"
      cite="mid:1640427851.k47q6y3qjb.astroid@bobo.none">
      <pre class="moz-quote-pre" wrap="">As far as I can see, the code should be functionally identical,
the difference is slightly in loop structure and priority nops
but that shouldn't cause complete lock ups.

I suspect possibly something is getting miscompiled. What distro
do you use, what gcc version? And would you be able to send the
output of objdump --disassemble=queued_spin_lock_slowpath vmlinux
for your bad kernel?

</pre>
    </blockquote>
    <p>Gentoo hardened musl. Kernels built with gcc 10.3.0 and with gcc
      11.2.0 both exhibit the lockups.</p>
    <p><span style="font-family:monospace"><span
          style="color:#000000;background-color:#ffffff;">/boot/disable/vmlinuz-5.12.0-rc3-ppc64le-00024-gdeb9b13eb257:
              file format elf64-powerpcle
        </span><br>
        <br>
        <br>
        Disassembly of section .head.text:
        <br>
        <br>
        Disassembly of section .text:
        <br>
        <br>
        c00000000010d0d4 <queued_spin_lock_slowpath>:
        <br>
        c00000000010d0d4:       e9 00 4c 3c     addis   r2,r12,233
        <br>
        c00000000010d0d8:       2c f3 42 38     addi    r2,r2,-3284
        <br>
        c00000000010d0dc:       00 01 04 28     cmplwi  r4,256
        <br>
        c00000000010d0e0:       2c 00 82 40     bne     c00000000010d10c
        <queued_spin_lock_slowpath+0x38>
        <br>
        c00000000010d0e4:       01 02 20 39     li      r9,513
        <br>
        c00000000010d0e8:       a6 03 29 7d     mtctr   r9
        <br>
        c00000000010d0ec:       02 00 83 e8     lwa     r4,0(r3)
        <br>
        c00000000010d0f0:       00 01 04 2c     cmpwi   r4,256
        <br>
        c00000000010d0f4:       14 00 82 40     bne     c00000000010d108
        <queued_spin_lock_slowpath+0x34>
        <br>
        c00000000010d0f8:       10 00 40 42     bdz     c00000000010d108
        <queued_spin_lock_slowpath+0x34>
        <br>
        c00000000010d0fc:       78 0b 21 7c     mr      r1,r1
        <br>
        c00000000010d100:       78 13 42 7c     mr      r2,r2
        <br>
        c00000000010d104:       e8 ff ff 4b     b       c00000000010d0ec
        <queued_spin_lock_slowpath+0x18>
        <br>
        c00000000010d108:       20 00 84 78     clrldi  r4,r4,32
        <br>
        c00000000010d10c:       2e 00 84 54     rlwinm  r4,r4,0,0,23
        <br>
        c00000000010d110:       00 00 04 2c     cmpwi   r4,0
        <br>
        c00000000010d114:       38 00 82 40     bne     c00000000010d14c
        <queued_spin_lock_slowpath+0x78>
        <br>
        c00000000010d118:       00 01 40 39     li      r10,256
        <br>
        c00000000010d11c:       28 18 20 7d     lwarx   r9,0,r3
        <br>
        c00000000010d120:       78 4b 48 7d     or      r8,r10,r9
        <br>
        c00000000010d124:       2d 19 00 7d     stwcx.  r8,0,r3
        <br>
        c00000000010d128:       f4 ff c2 40     bne-    c00000000010d11c
        <queued_spin_lock_slowpath+0x48>
        <br>
        c00000000010d12c:       2c 01 00 4c     isync
        <br>
        c00000000010d130:       2e 00 28 55     rlwinm  r8,r9,0,0,23
        <br>
        c00000000010d134:       20 00 2a 79     clrldi  r10,r9,32
        <br>
        c00000000010d138:       00 00 08 2c     cmpwi   r8,0
        <br>
        c00000000010d13c:       e4 00 82 41     beq     c00000000010d220
        <queued_spin_lock_slowpath+0x14c>
        <br>
        c00000000010d140:       00 ff 29 71     andi.   r9,r9,65280
        <br>
        c00000000010d144:       08 00 82 40     bne     c00000000010d14c
        <queued_spin_lock_slowpath+0x78>
        <br>
        c00000000010d148:       01 00 23 99     stb     r9,1(r3)
        <br>
        c00000000010d14c:       28 00 2d e9     ld      r9,40(r13)
        <br>
        c00000000010d150:       cf ff 42 3d     addis   r10,r2,-49
        <br>
        c00000000010d154:       01 00 00 39     li      r8,1
        <br>
        c00000000010d158:       80 15 4a 39     addi    r10,r10,5504
        <br>
        c00000000010d15c:       78 53 46 7d     mr      r6,r10
        <br>
        c00000000010d160:       14 4a c6 7c     add     r6,r6,r9
        <br>
        c00000000010d164:       0e 00 26 e9     lwa     r9,12(r6)
        <br>
        c00000000010d168:       01 00 e9 38     addi    r7,r9,1
        <br>
        c00000000010d16c:       03 00 09 2c     cmpwi   r9,3
        <br>
        c00000000010d170:       0c 00 e6 90     stw     r7,12(r6)
        <br>
        c00000000010d174:       00 00 ed a0     lhz     r7,0(r13)
        <br>
        c00000000010d178:       e4 00 81 41     bgt     c00000000010d25c
        <queued_spin_lock_slowpath+0x188>
        <br>
        c00000000010d17c:       e4 26 20 79     rldicr  r0,r9,4,59
        <br>
        c00000000010d180:       14 02 66 7d     add     r11,r6,r0
        <br>
        c00000000010d184:       00 00 00 39     li      r8,0
        <br>
        c00000000010d188:       08 00 0b 91     stw     r8,8(r11)
        <br>
        c00000000010d18c:       00 00 00 39     li      r8,0
        <br>
        c00000000010d190:       2a 01 06 7d     stdx    r8,r6,r0
        <br>
        c00000000010d194:       00 00 03 81     lwz     r8,0(r3)
        <br>
        c00000000010d198:       b5 07 08 7d     extsw.  r8,r8
        <br>
        c00000000010d19c:       04 01 82 41     beq     c00000000010d2a0
        <queued_spin_lock_slowpath+0x1cc>
        <br>
        c00000000010d1a0:       01 00 e7 38     addi    r7,r7,1
        <br>
        c00000000010d1a4:       1e 80 29 55     rlwinm  r9,r9,16,0,15
        <br>
        c00000000010d1a8:       f8 ff e1 fb     std     r31,-8(r1)
        <br>
        c00000000010d1ac:       1a 90 e7 54     rlwinm  r7,r7,18,0,13
        <br>
        c00000000010d1b0:       78 3b 27 7d     or      r7,r9,r7
        <br>
        c00000000010d1b4:       ac 04 20 7c     lwsync
        <br>
        c00000000010d1b8:       00 00 80 38     li      r4,0
        <br>
        c00000000010d1bc:       02 00 03 39     addi    r8,r3,2
        <br>
        c00000000010d1c0:       f8 1e 0c 55     rlwinm  r12,r8,3,27,28
        <br>
        c00000000010d1c4:       3e 84 e5 54     rlwinm  r5,r7,16,16,31
        <br>
        c00000000010d1c8:       ff ff 84 60     ori     r4,r4,65535
        <br>
        c00000000010d1cc:       64 07 08 79     rldicr  r8,r8,0,61
        <br>
        c00000000010d1d0:       30 60 a5 7c     slw     r5,r5,r12
        <br>
        c00000000010d1d4:       30 60 84 7c     slw     r4,r4,r12
        <br>
        c00000000010d1d8:       28 40 20 7d     lwarx   r9,0,r8
        <br>
        c00000000010d1dc:       78 20 3f 7d     andc    r31,r9,r4
        <br>
        c00000000010d1e0:       78 2b ff 7f     or      r31,r31,r5
        <br>
        c00000000010d1e4:       2d 41 e0 7f     stwcx.  r31,0,r8
        <br>
        c00000000010d1e8:       f0 ff c2 40     bne-    c00000000010d1d8
        <queued_spin_lock_slowpath+0x104>
        <br>
        c00000000010d1ec:       30 64 29 7d     srw     r9,r9,r12
        <br>
        c00000000010d1f0:       29 80 25 79     rldic.  r5,r9,16,32
        <br>
        c00000000010d1f4:       1e 80 28 55     rlwinm  r8,r9,16,0,15
        <br>
        c00000000010d1f8:       d0 00 82 40     bne     c00000000010d2c8
        <queued_spin_lock_slowpath+0x1f4>
        <br>
        c00000000010d1fc:       00 00 20 39     li      r9,0
        <br>
        c00000000010d200:       02 00 c3 e8     lwa     r6,0(r3)
        <br>
        c00000000010d204:       00 00 03 81     lwz     r8,0(r3)
        <br>
        c00000000010d208:       3e 04 05 55     clrlwi  r5,r8,16
        <br>
        c00000000010d20c:       00 00 05 2c     cmpwi   r5,0
        <br>
        c00000000010d210:       10 01 82 41     beq     c00000000010d320
        <queued_spin_lock_slowpath+0x24c>
        <br>
        c00000000010d214:       78 0b 21 7c     mr      r1,r1
        <br>
        c00000000010d218:       78 13 42 7c     mr      r2,r2
        <br>
        c00000000010d21c:       e4 ff ff 4b     b       c00000000010d200
        <queued_spin_lock_slowpath+0x12c>
        <br>
        c00000000010d220:       00 00 2a 2c     cmpdi   r10,0
        <br>
        c00000000010d224:       24 00 82 41     beq     c00000000010d248
        <queued_spin_lock_slowpath+0x174>
        <br>
        c00000000010d228:       02 00 23 e9     lwa     r9,0(r3)
        <br>
        c00000000010d22c:       3e 06 29 55     clrlwi  r9,r9,24
        <br>
        c00000000010d230:       00 00 09 2c     cmpwi   r9,0
        <br>
        c00000000010d234:       10 00 82 41     beq     c00000000010d244
        <queued_spin_lock_slowpath+0x170>
        <br>
        c00000000010d238:       78 0b 21 7c     mr      r1,r1
        <br>
        c00000000010d23c:       78 13 42 7c     mr      r2,r2
        <br>
        c00000000010d240:       e8 ff ff 4b     b       c00000000010d228
        <queued_spin_lock_slowpath+0x154>
        <br>
        c00000000010d244:       ac 04 20 7c     lwsync
        <br>
        c00000000010d248:       01 00 20 39     li      r9,1
        <br>
        c00000000010d24c:       00 00 23 b1     sth     r9,0(r3)
        <br>
        c00000000010d250:       20 00 80 4e     blr
        <br>
        c00000000010d254:       78 0b 21 7c     mr      r1,r1
        <br>
        c00000000010d258:       78 13 42 7c     mr      r2,r2
        <br>
        c00000000010d25c:       00 00 23 81     lwz     r9,0(r3)
        <br>
        c00000000010d260:       b5 07 29 7d     extsw.  r9,r9
        <br>
        c00000000010d264:       f0 ff 82 40     bne     c00000000010d254
        <queued_spin_lock_slowpath+0x180>
        <br>
        c00000000010d268:       28 18 e0 7c     lwarx   r7,0,r3
        <br>
        c00000000010d26c:       00 48 07 7c     cmpw    r7,r9
        <br>
        c00000000010d270:       10 00 c2 40     bne-    c00000000010d280
        <queued_spin_lock_slowpath+0x1ac>
        <br>
        c00000000010d274:       2d 19 00 7d     stwcx.  r8,0,r3
        <br>
        c00000000010d278:       f0 ff c2 40     bne-    c00000000010d268
        <queued_spin_lock_slowpath+0x194>
        <br>
        c00000000010d27c:       2c 01 00 4c     isync
        <br>
        c00000000010d280:       00 00 07 2c     cmpwi   r7,0
        <br>
        c00000000010d284:       d0 ff 82 40     bne     c00000000010d254
        <queued_spin_lock_slowpath+0x180>
        <br>
        c00000000010d288:       28 00 0d e9     ld      r8,40(r13)
        <br>
        c00000000010d28c:       0c 00 2a 39     addi    r9,r10,12
        <br>
        c00000000010d290:       2e 40 49 7d     lwzx    r10,r9,r8
        <br>
        c00000000010d294:       ff ff 4a 39     addi    r10,r10,-1
        <br>
        c00000000010d298:       2e 41 49 7d     stwx    r10,r9,r8
        <br>
        c00000000010d29c:       20 00 80 4e     blr
        <br>
        c00000000010d2a0:       01 00 a0 38     li      r5,1
        <br>
        c00000000010d2a4:       28 18 80 7c     lwarx   r4,0,r3
        <br>
        c00000000010d2a8:       00 40 04 7c     cmpw    r4,r8
        <br>
        c00000000010d2ac:       10 00 c2 40     bne-    c00000000010d2bc
        <queued_spin_lock_slowpath+0x1e8>
        <br>
        c00000000010d2b0:       2d 19 a0 7c     stwcx.  r5,0,r3
        <br>
        c00000000010d2b4:       f0 ff c2 40     bne-    c00000000010d2a4
        <queued_spin_lock_slowpath+0x1d0>
        <br>
        c00000000010d2b8:       2c 01 00 4c     isync
        <br>
        c00000000010d2bc:       00 00 04 2c     cmpwi   r4,0
        <br>
        c00000000010d2c0:       c8 ff 82 41     beq     c00000000010d288
        <queued_spin_lock_slowpath+0x1b4>
        <br>
        c00000000010d2c4:       dc fe ff 4b     b       c00000000010d1a0
        <queued_spin_lock_slowpath+0xcc>
        <br>
        c00000000010d2c8:       be 74 08 55     rlwinm  r8,r8,14,18,31
        <br>
        c00000000010d2cc:       04 00 a2 3c     addis   r5,r2,4
        <br>
        c00000000010d2d0:       f0 cf a5 38     addi    r5,r5,-12304
        <br>
        c00000000010d2d4:       a8 26 29 79     rldic   r9,r9,4,58
        <br>
        c00000000010d2d8:       ff ff 08 39     addi    r8,r8,-1
        <br>
        c00000000010d2dc:       14 4a 2a 7d     add     r9,r10,r9
        <br>
        c00000000010d2e0:       b4 07 08 7d     extsw   r8,r8
        <br>
        c00000000010d2e4:       24 1f 08 79     rldicr  r8,r8,3,60
        <br>
        c00000000010d2e8:       2a 40 05 7d     ldx     r8,r5,r8
        <br>
        c00000000010d2ec:       2a 41 69 7d     stdx    r11,r9,r8
        <br>
        c00000000010d2f0:       0a 00 2b e9     lwa     r9,8(r11)
        <br>
        c00000000010d2f4:       00 00 29 2c     cmpdi   r9,0
        <br>
        c00000000010d2f8:       10 00 82 40     bne     c00000000010d308
        <queued_spin_lock_slowpath+0x234>
        <br>
        c00000000010d2fc:       78 0b 21 7c     mr      r1,r1
        <br>
        c00000000010d300:       78 13 42 7c     mr      r2,r2
        <br>
        c00000000010d304:       ec ff ff 4b     b       c00000000010d2f0
        <queued_spin_lock_slowpath+0x21c>
        <br>
        c00000000010d308:       ac 04 20 7c     lwsync
        <br>
        c00000000010d30c:       2a 00 26 7d     ldx     r9,r6,r0
        <br>
        c00000000010d310:       00 00 29 2c     cmpdi   r9,0
        <br>
        c00000000010d314:       ec fe 82 41     beq     c00000000010d200
        <queued_spin_lock_slowpath+0x12c>
        <br>
        c00000000010d318:       ec 49 00 7c     dcbtstct 0,r9
        <br>
        c00000000010d31c:       e4 fe ff 4b     b       c00000000010d200
        <queued_spin_lock_slowpath+0x12c>
        <br>
        c00000000010d320:       ac 04 20 7c     lwsync
        <br>
        c00000000010d324:       1e 00 08 55     rlwinm  r8,r8,0,0,15
        <br>
        c00000000010d328:       00 38 08 7c     cmpw    r8,r7
        <br>
        c00000000010d32c:       2c 00 82 41     beq     c00000000010d358
        <queued_spin_lock_slowpath+0x284>
        <br>
        c00000000010d330:       00 00 29 2c     cmpdi   r9,0
        <br>
        c00000000010d334:       01 00 00 39     li      r8,1
        <br>
        c00000000010d338:       00 00 03 99     stb     r8,0(r3)
        <br>
        c00000000010d33c:       58 00 82 40     bne     c00000000010d394
        <queued_spin_lock_slowpath+0x2c0>
        <br>
        c00000000010d340:       00 00 2b e9     ld      r9,0(r11)
        <br>
        c00000000010d344:       00 00 29 2c     cmpdi   r9,0
        <br>
        c00000000010d348:       4c 00 82 40     bne     c00000000010d394
        <queued_spin_lock_slowpath+0x2c0>
        <br>
        c00000000010d34c:       78 0b 21 7c     mr      r1,r1
        <br>
        c00000000010d350:       78 13 42 7c     mr      r2,r2
        <br>
        c00000000010d354:       ec ff ff 4b     b       c00000000010d340
        <queued_spin_lock_slowpath+0x26c>
        <br>
        c00000000010d358:       01 00 00 39     li      r8,1
        <br>
        c00000000010d35c:       28 18 e0 7c     lwarx   r7,0,r3
        <br>
        c00000000010d360:       00 30 07 7c     cmpw    r7,r6
        <br>
        c00000000010d364:       0c 00 c2 40     bne-    c00000000010d370
        <queued_spin_lock_slowpath+0x29c>
        <br>
        c00000000010d368:       2d 19 00 7d     stwcx.  r8,0,r3
        <br>
        c00000000010d36c:       f0 ff c2 40     bne-    c00000000010d35c
        <queued_spin_lock_slowpath+0x288>
        <br>
        c00000000010d370:       00 38 06 7c     cmpw    r6,r7
        <br>
        c00000000010d374:       bc ff 82 40     bne     c00000000010d330
        <queued_spin_lock_slowpath+0x25c>
        <br>
        c00000000010d378:       28 00 0d e9     ld      r8,40(r13)
        <br>
        c00000000010d37c:       0c 00 2a 39     addi    r9,r10,12
        <br>
        c00000000010d380:       2e 40 49 7d     lwzx    r10,r9,r8
        <br>
        c00000000010d384:       ff ff 4a 39     addi    r10,r10,-1
        <br>
        c00000000010d388:       2e 41 49 7d     stwx    r10,r9,r8
        <br>
        c00000000010d38c:       f8 ff e1 eb     ld      r31,-8(r1)
        <br>
        c00000000010d390:       20 00 80 4e     blr
        <br>
        c00000000010d394:       ac 04 20 7c     lwsync
        <br>
        c00000000010d398:       01 00 00 39     li      r8,1
        <br>
        c00000000010d39c:       08 00 09 91     stw     r8,8(r9)
        <br>
        c00000000010d3a0:       d8 ff ff 4b     b       c00000000010d378
        <queued_spin_lock_slowpath+0x2a4>
        <br>
        <br>
        Disassembly of section .init.text:
        <br>
        <br>
        Disassembly of section .exit.text:<br>
      </span></p>
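    <p>For orientation, here is a rough userspace C sketch of the spin
      loop that I believe sits at
      <code>queued_spin_lock_slowpath+0x154</code> in the listing above:
      load the lock word, mask the low byte, two priority nops, branch
      back, then <code>lwsync</code> on exit. The names
      <code>Q_LOCKED_MASK</code>, <code>cpu_relax_stub</code> and
      <code>wait_until_unlocked</code> are made up for illustration, and
      this is not the kernel source. Note also that the watchdog offset
      comes from the 5.15 kernel while the listing is from the 5.12-rc3
      bisect build, so treating the two offsets as the same loop is an
      assumption.</p>

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical name: low byte of the lock word holds the "locked" flag,
 * matching the clrlwi r9,r9,24 mask in the listing. */
#define Q_LOCKED_MASK 0xffu

/* Stand-in for the two priority nops (mr r1,r1 / mr r2,r2) in the listing.
 * Deliberately empty in this userspace sketch. */
static inline void cpu_relax_stub(void)
{
}

/* Spin with relaxed loads until the locked byte reads as zero, then
 * issue an acquire fence (the lwsync after the loop in the listing).
 * Returns the last value that was loaded. */
static uint32_t wait_until_unlocked(_Atomic uint32_t *word)
{
    uint32_t v;

    for (;;) {
        v = atomic_load_explicit(word, memory_order_relaxed);
        if (!(v & Q_LOCKED_MASK))
            break;
        cpu_relax_stub();
    }
    atomic_thread_fence(memory_order_acquire);
    return v;
}
```

    <p>With the locked byte clear the function returns at once, e.g. a
      word of 0x100 (pending set, locked clear) comes back unchanged.
      With it set, the caller spins until another CPU clears it, which
      is where a hard lockup would show up if that never happens.</p>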
    <blockquote type="cite"
      cite="mid:1640427851.k47q6y3qjb.astroid@bobo.none">
      <pre class="moz-quote-pre" wrap="">

</pre>
      <blockquote type="cite">
        <pre class="moz-quote-pre" wrap="">   

The problem persists in 2f47a9a4dfa3674fad19a49b40c5103a9a8e1589 and
goes away if I revert deb9b13eb2571fbde164ae012c77985fd14f2f02 on top of
that. As deb9b13eb2571fbde164ae012c77985fd14f2f02 seems to be a revert
of 49a7d46a06c30c7beabbf9d1a8ea1de0f9e4fdfe, I suspect this problem
might have existed before 49a7d46a06c30c7beabbf9d1a8ea1de0f9e4fdfe. I
therefore tried to build 49a7d46a06c30c7beabbf9d1a8ea1de0f9e4fdfe and
49a7d46a06c30c7beabbf9d1a8ea1de0f9e4fdfe^1 to verify if the problem
exists there as well, unfortunately these commits don't build due to the
following compile error:

kernel/smp.c: In function 'smp_init':
./include/linux/compiler.h:392:38: error: call to
'__compiletime_assert_150' declared with attribute error: BUILD_BUG_ON
failed: offsetof(struct task_struct, wake_entry_type) - offsetof(struct
task_struct, wake_entry) != offsetof(struct __call_single_data, flags) -
offsetof(struct __call_single_data, llist)
 392 |  _compiletime_assert(condition, msg, __compiletime_assert_,
__COUNTER__)
     |                                      ^

</pre>
      </blockquote>
    </blockquote>
    <p>Switching from gcc 10.3.0 to gcc 11.2.0 made the above compile
      error go away. As expected,
      49a7d46a06c30c7beabbf9d1a8ea1de0f9e4fdfe boots fine and
      49a7d46a06c30c7beabbf9d1a8ea1de0f9e4fdfe^1 exhibits the same
      lockups. I started bisecting the second part, but I'll pause that
      effort for now.</p>
    <p>Thanks,<br>
      Stijn<br>
    </p>
  </body>
</html>