<div dir="ltr"><div dir="ltr">On Thu, Nov 5, 2020 at 2:19 AM Michael Ellerman <<a href="mailto:mpe@ellerman.id.au">mpe@ellerman.id.au</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Carl Jacobsen <<a href="mailto:cjacobsen@storix.com" target="_blank">cjacobsen@storix.com</a>> writes:<br>
> The panic (on a call to malloc from static linked libcrypto) looks like<br>
> this:<br><br>
What hardware is this on?<br></blockquote><div><br></div><div>Thank you for looking into this.</div><div><br></div><div>The system that's panicking identifies like this:<br>    # uname -a<br>    Linux sl151pwr8 4.12.14-197.18-default #1 SMP Tue Sep 17 14:26:49 UTC 2019<br>    (d75059b) ppc64le ppc64le ppc64le GNU/Linux<br>    #<br>    # cat /etc/os-release<br>    NAME="SLES"<br>    VERSION="15-SP1"<br>    VERSION_ID="15.1"<br>    PRETTY_NAME="SUSE Linux Enterprise Server 15 SP1"<br>    ID="sles"<br>    ID_LIKE="suse"<br>    ANSI_COLOR="0;32"<br>    CPE_NAME="cpe:/o:suse:sles:15:sp1"<br><br>The system is an LPAR running under PowerVM vios version 2.2.3.4.<br>The underlying hardware is machine type-model 8284-22A.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Can you try booting with ppc_tm=off on the kernel command line, and see<br>
if that changes anything?<br></blockquote><div><br></div><div>Yes, the output is further down. It doesn't appear to change much, but I don't have</div><div>the background to interpret the registers.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Can you put your compiled test program up somewhere we can download it<br>
and look at? Or post the disassembly?<br></blockquote><div><br></div><div>Here's the source file:<br>    <a href="https://www.storix.com/download/support/misc/rand_test.c">https://www.storix.com/download/support/misc/rand_test.c</a><br><br>Here's the resulting executable:<br>    <a href="https://www.storix.com/download/support/misc/rand_test">https://www.storix.com/download/support/misc/rand_test</a><br></div><div><br></div><div>Executable is linked to libcrypto from openssl-1.1.1g, configured with:<br>    ./config no-shared no-dso no-threads -fPIC -ggdb3 -debug -static<br><br>Executable is built (on SUSE 12) with:<br>    gcc -ggdb3 -o rand_test rand_test.c libcrypto.a<br><br><br>And running the executable (on SUSE 15.1) through gdb goes like this:<br><br>    # gdb --args ./rand_test</div><div>    GNU gdb (GDB; SUSE Linux Enterprise 15) 8.3.1</div><div>    << snip intro text >></div><div>    Reading symbols from ./rand_test...<br>    (gdb) b main<br>    Breakpoint 1 at 0x1000288c: file rand_test.c, line 6.<br>    (gdb) r<br>    Starting program: /tmp/ossl/rand_test <br><br>    Breakpoint 1, main (argc=1, argv=0x7ffffffff798) at rand_test.c:6<br>    6           int has_enough_data = RAND_status();<br>    (gdb) s<br>    RAND_status () at crypto/rand/rand_lib.c:958<br>    958         const RAND_METHOD *meth = RAND_get_rand_method();<br>    (gdb) <br>    RAND_get_rand_method () at crypto/rand/rand_lib.c:844<br>    844         const RAND_METHOD *tmp_meth = NULL;<br>    (gdb) <br>    846         if (!RUN_ONCE(&rand_init, do_rand_init))<br>    (gdb) <br>    CRYPTO_THREAD_run_once (once=0x102a7d88 <rand_init>, <br>     init=0x10002f30 <do_rand_init_ossl_>) at crypto/threads_none.c:67<br>    67          if (*once != 0)<br>    (gdb) <br>    70          init();<br>    (gdb) <br>    do_rand_init_ossl_ () at crypto/rand/rand_lib.c:306<br>    306     DEFINE_RUN_ONCE_STATIC(do_rand_init)<br>    (gdb) <br>    do_rand_init () at crypto/rand/rand_lib.c:309<br>    309         
rand_engine_lock = CRYPTO_THREAD_lock_new();<br>    (gdb) <br>    CRYPTO_THREAD_lock_new () at crypto/threads_none.c:24<br>    24          if ((lock = OPENSSL_zalloc(sizeof(unsigned int))) == NULL) {<br>    (gdb) <br>    CRYPTO_zalloc (num=4, file=0x1023a500 "crypto/threads_none.c", line=24)<br>    at crypto/mem.c:230<br>    230         void *ret = CRYPTO_malloc(num, file, line);<br>    (gdb) <br>    CRYPTO_malloc (num=4, file=0x1023a500 "crypto/threads_none.c", line=24)<br> at crypto/mem.c:194<br>    194         void *ret = NULL;<br>    (gdb) <br>    197         if (malloc_impl != NULL && malloc_impl != CRYPTO_malloc)<br>    (gdb) <br>    200         if (num == 0)<br>    (gdb) <br>    204         if (allow_customize) {<br>    (gdb) <br>    210             allow_customize = 0;<br>    (gdb) <br>    222         ret = malloc(num);<br>    (gdb) <br>    Bad kernel stack pointer 7fffffffef20 at 700<br>    Oops: Bad kernel stack pointer, sig: 6 [#1]<br>    SMP NR_CPUS=2048 <br>    NUMA <br>    pSeries<br>    Modules linked in: scsi_transport_iscsi af_packet xt_tcpudp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute br_netfilter bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ibmveth(X) vmx_crypto gf128mul crct10dif_vpmsum rtc_generic btrfs xor zstd_decompress zstd_compress xxhash raid6_pq sr_mod cdrom sd_mod ibmvscsi(X) scsi_transport_srp crc32c_vpmsum sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4<br>    Supported: Yes, External<br>    CPU: 4 PID: 3082 Comm: rand_test Tainted: G                   4.12.14-197.18-default #1 SLE15-SP1<br>    task: c00000002e226100 task.stack: c0000000387c8000<br>    NIP: 
0000000000000700 LR: 0000000010004acc CTR: 0000000000000000<br>    REGS: c00000001ebffd40 TRAP: 0300   Tainted: G                    (4.12.14-197.18-default)<br>    MSR: 8000000000001000 <SF,ME><br>      CR: 44000844  XER: 20000000<br>    CFAR: 00000000000010f0 DAR: ffffffffffffb27a DSISR: 40000000 SOFTE: 0 <br>    GPR00: 0000000020000000 00007fffffffef20 00000000102af788 fffffffffffffffd <br>    GPR04: 0000000000000020 0000000000000030 00000000102b0760 0000000000000001 <br>    GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730 800000010280f033 <br>    GPR12: 0000000000004000 00007fffb7ffa100 0000000000000000 0000000000000000 <br>    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 <br>    GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 <br>    GPR24: 0000000000000000 0000000000000000 0000000000000000 00007fffb7fef4b8 <br>    GPR28: 00007fffb7ff0000 0000000000000000 0000000000000000 00007fffffffef20 <br>    NIP [0000000000000700] 0x700<br>    LR [0000000010004acc] 0x10004acc<br>    Call Trace:<br>    Instruction dump:<br>    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <br>    00000000 00000000 00000000 00000000 7db243a6 7db142a6 f92d0080 7d20e2a6 <br>    ---[ end trace 167d5d3b2e8a06e9 ]---<br><br>    Sending IPI to other CPUs<br>    IPI complete<br>    kexec: Starting switchover sequence.<br>    I'm in purgatory<br>     -> smp_release_cpus()<br>    spinning_secondaries = 0<br>     <- smp_release_cpus()<br>    Kernel panic - not syncing: Out of memory and no killable processes...<br><br>    CPU: 4 PID: 1 Comm: swapper/4 Not tainted 4.12.14-197.18-default #1 SLE15-SP1<br>    Call Trace:<br>    [c000000012457210] [c000000008a20140] dump_stack+0xb0/0xf0 (unreliable)<br>    [c000000012457250] [c000000008a1ccd4] panic+0x144/0x31c<br>    [c0000000124572e0] [c0000000082efcc0] out_of_memory+0x3f0/0x700<br>    [c000000012457380] [c0000000082f7ed4] __alloc_pages_nodemask+0x1004/0x10b0<br> 
   [c000000012457570] [c00000000837f4d8] alloc_page_interleave+0x58/0x110<br>    [c0000000124575b0] [c0000000083800bc] alloc_pages_current+0x16c/0x1d0<br>    [c000000012457610] [c0000000082e8398] __page_cache_alloc+0xd8/0x150<br>    [c000000012457650] [c0000000082e8574] pagecache_get_page+0x164/0x440<br>    [c0000000124576b0] [c0000000082e8884] grab_cache_page_write_begin+0x34/0x70<br>    [c0000000124576e0] [c00000000840ede8] simple_write_begin+0x48/0x190<br>    [c000000012457720] [c0000000082e7c7c] generic_perform_write+0xec/0x270<br>    [c0000000124577b0] [c0000000082ea2e0] __generic_file_write_iter+0x250/0x2a0<br>    [c000000012457810] [c0000000082ea53c] generic_file_write_iter+0x20c/0x2e0<br>    [c000000012457850] [c0000000083cc0e0] __vfs_write+0x120/0x1e0<br>    [c0000000124578e0] [c0000000083cdfc8] vfs_write+0xd8/0x220<br>    [c000000012457930] [c0000000083cfeec] SyS_write+0x6c/0x110<br>    [c000000012457980] [c000000008d154c4] xwrite+0x54/0xb8<br>    [c0000000124579c0] [c000000008d15574] do_copy+0x4c/0x17c<br>    [c0000000124579f0] [c000000008d15140] write_buffer+0x64/0x90<br>    [c000000012457a20] [c000000008d151d4] flush_buffer+0x68/0xf4<br>    [c000000012457a70] [c000000008d62268] unxz+0x210/0x398<br>    [c000000012457b10] [c000000008d15efc] unpack_to_rootfs+0x1f0/0x360<br>    [c000000012457bc0] [c000000008d16108] populate_rootfs+0x9c/0x188<br>    [c000000012457c40] [c00000000800f5d4] do_one_initcall+0x64/0x1d0<br>    [c000000012457d00] [c000000008d14474] kernel_init_freeable+0x294/0x388<br>    [c000000012457dc0] [c00000000801026c] kernel_init+0x2c/0x160<br>    [c000000012457e30] [c00000000800b560] ret_from_kernel_thread+0x5c/0x7c<br>    ------------[ cut here ]------------<br><br><br>Doing the same thing but with ppc_tm=off...<br>    # cat /proc/cmdline <br>    BOOT_IMAGE=/boot/vmlinux-4.12.14-197.18-default root=UUID=0e795e37-3692-465a-a037-c2935a9fde7a mitigations=auto quiet crashkernel=197M ppc_tm=off<br><br><br>Results in a panic at the same point, 
with a few registers changed:<br><br>    << snip down to panic at malloc >><br>    (gdb) <br>    Bad kernel stack pointer 7fffffffef20 at 700<br>    Oops: Bad kernel stack pointer, sig: 6 [#1]<br>    SMP NR_CPUS=2048 <br>    NUMA <br>    pSeries<br>    Modules linked in: scsi_transport_iscsi af_packet xt_tcpudp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute br_netfilter bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables ibmveth(X) vmx_crypto gf128mul crct10dif_vpmsum rtc_generic btrfs xor zstd_decompress zstd_compress xxhash raid6_pq sr_mod cdrom sd_mod ibmvscsi(X) scsi_transport_srp crc32c_vpmsum sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4<br>    Supported: Yes, External<br>    CPU: 2 PID: 3079 Comm: rand_test Tainted: G                   4.12.14-197.18-default #1 SLE15-SP1<br>    task: c00000002f6bcc00 task.stack: c0000000321fc000<br>    NIP: 0000000000000700 LR: 0000000010004acc CTR: 0000000000000000<br>    REGS: c00000001ec17d40 TRAP: 0300   Tainted: G                    (4.12.14-197.18-default)<br>    MSR: 8000000000001000 <SF,ME><br>      CR: 44000844  XER: 20000000<br>    CFAR: 00000000000010f0 DAR: ffffffffffffb27a DSISR: 40000000 SOFTE: 0 <br>    GPR00: 0000000020000000 00007fffffffef20 00000000102af788 fffffffffffffffd <br>    GPR04: 0000000000000020 0000000000000030 00000000102b0760 0000000000000001 <br>    GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730 800000000280f033 <br>    GPR12: 0000000000004000 00007fffb7ffa100 0000000000000000 0000000000000000 <br>    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 <br>    GPR20: 0000000000000000 0000000000000000 
0000000000000000 0000000000000000 <br>    GPR24: 0000000000000000 0000000000000000 0000000000000000 00007fffb7fef4b8 <br>    GPR28: 00007fffb7ff0000 0000000000000000 0000000000000000 00007fffffffef20 <br>    NIP [0000000000000700] 0x700<br>    LR [0000000010004acc] 0x10004acc<br>    Call Trace:<br>    Instruction dump:<br>    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 <br>    00000000 00000000 00000000 00000000 7db243a6 7db142a6 f92d0080 7d20e2a6 <br>    ---[ end trace 436f626dd098548c ]---<br><br>    Sending IPI to other CPUs<br>    IPI complete<br>    kexec: Starting switchover sequence.<br>    I'm in purgatory<br>     -> smp_release_cpus()<br>    spinning_secondaries = 0<br>     <- smp_release_cpus()<br>    Kernel panic - not syncing: Out of memory and no killable processes...<br><br>    CPU: 2 PID: 1 Comm: swapper/2 Not tainted 4.12.14-197.18-default #1 SLE15-SP1<br>    Call Trace:<br>    [c000000012457210] [c000000008a20140] dump_stack+0xb0/0xf0 (unreliable)<br>    [c000000012457250] [c000000008a1ccd4] panic+0x144/0x31c<br>    [c0000000124572e0] [c0000000082efcc0] out_of_memory+0x3f0/0x700<br>    [c000000012457380] [c0000000082f7ed4] __alloc_pages_nodemask+0x1004/0x10b0<br>    [c000000012457570] [c00000000837f4d8] alloc_page_interleave+0x58/0x110<br>    [c0000000124575b0] [c0000000083800bc] alloc_pages_current+0x16c/0x1d0<br>    [c000000012457610] [c0000000082e8398] __page_cache_alloc+0xd8/0x150<br>    [c000000012457650] [c0000000082e8574] pagecache_get_page+0x164/0x440<br>    [c0000000124576b0] [c0000000082e8884] grab_cache_page_write_begin+0x34/0x70<br>    [c0000000124576e0] [c00000000840ede8] simple_write_begin+0x48/0x190<br>    [c000000012457720] [c0000000082e7c7c] generic_perform_write+0xec/0x270<br>    [c0000000124577b0] [c0000000082ea2e0] __generic_file_write_iter+0x250/0x2a0<br>    [c000000012457810] [c0000000082ea53c] generic_file_write_iter+0x20c/0x2e0<br>    [c000000012457850] [c0000000083cc0e0] 
__vfs_write+0x120/0x1e0<br>    [c0000000124578e0] [c0000000083cdfc8] vfs_write+0xd8/0x220<br>    [c000000012457930] [c0000000083cfeec] SyS_write+0x6c/0x110<br>    [c000000012457980] [c000000008d154c4] xwrite+0x54/0xb8<br>    [c0000000124579c0] [c000000008d15574] do_copy+0x4c/0x17c<br>    [c0000000124579f0] [c000000008d15140] write_buffer+0x64/0x90<br>    [c000000012457a20] [c000000008d151d4] flush_buffer+0x68/0xf4<br>    [c000000012457a70] [c000000008d62268] unxz+0x210/0x398<br>    [c000000012457b10] [c000000008d15efc] unpack_to_rootfs+0x1f0/0x360<br>    [c000000012457bc0] [c000000008d16108] populate_rootfs+0x9c/0x188<br>    [c000000012457c40] [c00000000800f5d4] do_one_initcall+0x64/0x1d0<br>    [c000000012457d00] [c000000008d14474] kernel_init_freeable+0x294/0x388<br>    [c000000012457dc0] [c00000000801026c] kernel_init+0x2c/0x160<br>    [c000000012457e30] [c00000000800b560] ret_from_kernel_thread+0x5c/0x7c<br>    ------------[ cut here ]------------<br><br><br>Diffing the panic output looks like this (highlighting register changes?):<br><br>    74,75c79,80<br>    < CPU: 4 PID: 3082 Comm: rand_test Tainted: G                   4.12.14-197.18-default #1 SLE15-SP1<br>    < task: c00000002e226100 task.stack: c0000000387c8000<br>    ---<br>    > CPU: 2 PID: 3079 Comm: rand_test Tainted: G                   4.12.14-197.18-default #1 SLE15-SP1<br>    > task: c00000002f6bcc00 task.stack: c0000000321fc000<br>    77c82<br>    < REGS: c00000001ebffd40 TRAP: 0300   Tainted: G                    (4.12.14-197.18-default)<br>    ---<br>    > REGS: c00000001ec17d40 TRAP: 0300   Tainted: G                    (4.12.14-197.18-default)<br>    83c88<br>    < GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730 800000010280f033 <br>    ---<br>    > GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730 800000000280f033 <br>    95c100<br>    < ---[ end trace 167d5d3b2e8a06e9 ]---<br>    ---<br>    > ---[ end trace 436f626dd098548c ]---<br>    106c111<br>    < CPU: 4 PID: 1 
Comm: swapper/4 Not tainted 4.12.14-197.18-default #1 SLE15-SP1<br>    ---<br>    > CPU: 2 PID: 1 Comm: swapper/2 Not tainted 4.12.14-197.18-default #1 SLE15-SP1<br> <br></div></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr">Carl Jacobsen<div>Storix, Inc.</div></div></div></div>