Kernel panic from malloc() on SUSE 15.1?
Carl Jacobsen
cjacobsen at storix.com
Fri Nov 6 06:44:06 AEDT 2020
On Thu, Nov 5, 2020 at 2:19 AM Michael Ellerman <mpe at ellerman.id.au> wrote:
> Carl Jacobsen <cjacobsen at storix.com> writes:
> > The panic (on a call to malloc from static linked libcrypto) looks like
> > this:
>
> What hardware is this on?
>
Thank you for looking into this.
The system that's panicking identifies like this:
# uname -a
Linux sl151pwr8 4.12.14-197.18-default #1 SMP Tue Sep 17 14:26:49 UTC 2019 (d75059b) ppc64le ppc64le ppc64le GNU/Linux
#
# cat /etc/os-release
NAME="SLES"
VERSION="15-SP1"
VERSION_ID="15.1"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP1"
ID="sles"
ID_LIKE="suse"
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:suse:sles:15:sp1"
The system is an LPAR running under PowerVM (VIOS version 2.2.3.4).
The underlying hardware is machine type-model 8284-22A.
> Can you try booting with ppc_tm=off on the kernel command line, and see
> if that changes anything?
>
Yes, the output is below. It doesn't appear to change much, but I don't have
the background to interpret the registers.
> Can you put your compiled test program up somewhere we can download it
> and look at? Or post the disassembly?
>
Here's the source file:
https://www.storix.com/download/support/misc/rand_test.c
Here's the resulting executable:
https://www.storix.com/download/support/misc/rand_test
Executable is linked to libcrypto from openssl-1.1.1g, configured with:
./config no-shared no-dso no-threads -fPIC -ggdb3 -debug -static
Executable is built (on SUSE 12) with:
gcc -ggdb3 -o rand_test rand_test.c libcrypto.a
And running the executable (on SUSE 15.1) through gdb goes like this:
# gdb --args ./rand_test
GNU gdb (GDB; SUSE Linux Enterprise 15) 8.3.1
<< snip intro text >>
Reading symbols from ./rand_test...
(gdb) b main
Breakpoint 1 at 0x1000288c: file rand_test.c, line 6.
(gdb) r
Starting program: /tmp/ossl/rand_test
Breakpoint 1, main (argc=1, argv=0x7ffffffff798) at rand_test.c:6
6 int has_enough_data = RAND_status();
(gdb) s
RAND_status () at crypto/rand/rand_lib.c:958
958 const RAND_METHOD *meth = RAND_get_rand_method();
(gdb)
RAND_get_rand_method () at crypto/rand/rand_lib.c:844
844 const RAND_METHOD *tmp_meth = NULL;
(gdb)
846 if (!RUN_ONCE(&rand_init, do_rand_init))
(gdb)
CRYPTO_THREAD_run_once (once=0x102a7d88 <rand_init>,
init=0x10002f30 <do_rand_init_ossl_>) at crypto/threads_none.c:67
67 if (*once != 0)
(gdb)
70 init();
(gdb)
do_rand_init_ossl_ () at crypto/rand/rand_lib.c:306
306 DEFINE_RUN_ONCE_STATIC(do_rand_init)
(gdb)
do_rand_init () at crypto/rand/rand_lib.c:309
309 rand_engine_lock = CRYPTO_THREAD_lock_new();
(gdb)
CRYPTO_THREAD_lock_new () at crypto/threads_none.c:24
24 if ((lock = OPENSSL_zalloc(sizeof(unsigned int))) == NULL) {
(gdb)
CRYPTO_zalloc (num=4, file=0x1023a500 "crypto/threads_none.c", line=24)
at crypto/mem.c:230
230 void *ret = CRYPTO_malloc(num, file, line);
(gdb)
CRYPTO_malloc (num=4, file=0x1023a500 "crypto/threads_none.c", line=24)
at crypto/mem.c:194
194 void *ret = NULL;
(gdb)
197 if (malloc_impl != NULL && malloc_impl != CRYPTO_malloc)
(gdb)
200 if (num == 0)
(gdb)
204 if (allow_customize) {
(gdb)
210 allow_customize = 0;
(gdb)
222 ret = malloc(num);
(gdb)
Bad kernel stack pointer 7fffffffef20 at 700
Oops: Bad kernel stack pointer, sig: 6 [#1]
SMP NR_CPUS=2048
NUMA
pSeries
Modules linked in: scsi_transport_iscsi af_packet xt_tcpudp
ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ip_set nfnetlink
ebtable_nat ebtable_broute br_netfilter bridge stp llc ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw
ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security
ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables
x_tables ibmveth(X) vmx_crypto gf128mul crct10dif_vpmsum rtc_generic btrfs
xor zstd_decompress zstd_compress xxhash raid6_pq sr_mod cdrom sd_mod
ibmvscsi(X) scsi_transport_srp crc32c_vpmsum sg dm_multipath dm_mod
scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
Supported: Yes, External
CPU: 4 PID: 3082 Comm: rand_test Tainted: G 4.12.14-197.18-default #1 SLE15-SP1
task: c00000002e226100 task.stack: c0000000387c8000
NIP: 0000000000000700 LR: 0000000010004acc CTR: 0000000000000000
REGS: c00000001ebffd40 TRAP: 0300 Tainted: G (4.12.14-197.18-default)
MSR: 8000000000001000 <SF,ME>
CR: 44000844 XER: 20000000
CFAR: 00000000000010f0 DAR: ffffffffffffb27a DSISR: 40000000 SOFTE: 0
GPR00: 0000000020000000 00007fffffffef20 00000000102af788 fffffffffffffffd
GPR04: 0000000000000020 0000000000000030 00000000102b0760 0000000000000001
GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730 800000010280f033
GPR12: 0000000000004000 00007fffb7ffa100 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 0000000000000000 00007fffb7fef4b8
GPR28: 00007fffb7ff0000 0000000000000000 0000000000000000 00007fffffffef20
NIP [0000000000000700] 0x700
LR [0000000010004acc] 0x10004acc
Call Trace:
Instruction dump:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 7db243a6 7db142a6 f92d0080 7d20e2a6
---[ end trace 167d5d3b2e8a06e9 ]---
Sending IPI to other CPUs
IPI complete
kexec: Starting switchover sequence.
I'm in purgatory
-> smp_release_cpus()
spinning_secondaries = 0
<- smp_release_cpus()
Kernel panic - not syncing: Out of memory and no killable processes...
CPU: 4 PID: 1 Comm: swapper/4 Not tainted 4.12.14-197.18-default #1 SLE15-SP1
Call Trace:
[c000000012457210] [c000000008a20140] dump_stack+0xb0/0xf0 (unreliable)
[c000000012457250] [c000000008a1ccd4] panic+0x144/0x31c
[c0000000124572e0] [c0000000082efcc0] out_of_memory+0x3f0/0x700
[c000000012457380] [c0000000082f7ed4] __alloc_pages_nodemask+0x1004/0x10b0
[c000000012457570] [c00000000837f4d8] alloc_page_interleave+0x58/0x110
[c0000000124575b0] [c0000000083800bc] alloc_pages_current+0x16c/0x1d0
[c000000012457610] [c0000000082e8398] __page_cache_alloc+0xd8/0x150
[c000000012457650] [c0000000082e8574] pagecache_get_page+0x164/0x440
[c0000000124576b0] [c0000000082e8884] grab_cache_page_write_begin+0x34/0x70
[c0000000124576e0] [c00000000840ede8] simple_write_begin+0x48/0x190
[c000000012457720] [c0000000082e7c7c] generic_perform_write+0xec/0x270
[c0000000124577b0] [c0000000082ea2e0] __generic_file_write_iter+0x250/0x2a0
[c000000012457810] [c0000000082ea53c] generic_file_write_iter+0x20c/0x2e0
[c000000012457850] [c0000000083cc0e0] __vfs_write+0x120/0x1e0
[c0000000124578e0] [c0000000083cdfc8] vfs_write+0xd8/0x220
[c000000012457930] [c0000000083cfeec] SyS_write+0x6c/0x110
[c000000012457980] [c000000008d154c4] xwrite+0x54/0xb8
[c0000000124579c0] [c000000008d15574] do_copy+0x4c/0x17c
[c0000000124579f0] [c000000008d15140] write_buffer+0x64/0x90
[c000000012457a20] [c000000008d151d4] flush_buffer+0x68/0xf4
[c000000012457a70] [c000000008d62268] unxz+0x210/0x398
[c000000012457b10] [c000000008d15efc] unpack_to_rootfs+0x1f0/0x360
[c000000012457bc0] [c000000008d16108] populate_rootfs+0x9c/0x188
[c000000012457c40] [c00000000800f5d4] do_one_initcall+0x64/0x1d0
[c000000012457d00] [c000000008d14474] kernel_init_freeable+0x294/0x388
[c000000012457dc0] [c00000000801026c] kernel_init+0x2c/0x160
[c000000012457e30] [c00000000800b560] ret_from_kernel_thread+0x5c/0x7c
------------[ cut here ]------------
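For anyone reading the first line of the oops: the kernel prints the value of r1 (the stack pointer, GPR01 in the dump) and the NIP it had when it decided the stack was unusable. The following is a minimal Python sketch, not part of the original report, that parses that line and checks whether the stack pointer is a kernel address. The threshold `KERNEL_BASE` and the helper names are assumptions for illustration (ppc64 Linux kernel addresses conventionally start at 0xc000000000000000, as the `task.stack` value in the dump suggests).

```python
import re

# Assumption: kernel addresses on ppc64 Linux start here (compare the
# task.stack value c0000000387c8000 in the dump above).
KERNEL_BASE = 0xC000000000000000


def parse_bad_stack(line):
    """Return (stack_pointer, nip) from a 'Bad kernel stack pointer' line."""
    m = re.search(r"Bad kernel stack pointer ([0-9a-f]+) at ([0-9a-f]+)", line)
    if m is None:
        raise ValueError("not a bad-stack-pointer line")
    return int(m.group(1), 16), int(m.group(2), 16)


def is_kernel_address(addr):
    """True if addr lies in the (assumed) kernel address range."""
    return addr >= KERNEL_BASE


sp, nip = parse_bad_stack("Bad kernel stack pointer 7fffffffef20 at 700")
print(hex(sp), hex(nip), is_kernel_address(sp))
print(is_kernel_address(0xC0000000387C8000))  # task.stack from the dump
```

Under that assumption, the reported stack pointer is a userspace address: it equals GPR01 in the register dump and sits just below the `argv` pointer that gdb printed at the breakpoint.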
Doing the same thing but with ppc_tm=off...
# cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinux-4.12.14-197.18-default
root=UUID=0e795e37-3692-465a-a037-c2935a9fde7a mitigations=auto quiet
crashkernel=197M ppc_tm=off
Results in a panic at the same point, with a few registers changed:
<< snip down to panic at malloc >>
(gdb)
Bad kernel stack pointer 7fffffffef20 at 700
Oops: Bad kernel stack pointer, sig: 6 [#1]
SMP NR_CPUS=2048
NUMA
pSeries
Modules linked in: scsi_transport_iscsi af_packet xt_tcpudp
ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ip_set nfnetlink
ebtable_nat ebtable_broute br_netfilter bridge stp llc ip6table_nat
nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw
ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security
ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables
x_tables ibmveth(X) vmx_crypto gf128mul crct10dif_vpmsum rtc_generic btrfs
xor zstd_decompress zstd_compress xxhash raid6_pq sr_mod cdrom sd_mod
ibmvscsi(X) scsi_transport_srp crc32c_vpmsum sg dm_multipath dm_mod
scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
Supported: Yes, External
CPU: 2 PID: 3079 Comm: rand_test Tainted: G 4.12.14-197.18-default #1 SLE15-SP1
task: c00000002f6bcc00 task.stack: c0000000321fc000
NIP: 0000000000000700 LR: 0000000010004acc CTR: 0000000000000000
REGS: c00000001ec17d40 TRAP: 0300 Tainted: G (4.12.14-197.18-default)
MSR: 8000000000001000 <SF,ME>
CR: 44000844 XER: 20000000
CFAR: 00000000000010f0 DAR: ffffffffffffb27a DSISR: 40000000 SOFTE: 0
GPR00: 0000000020000000 00007fffffffef20 00000000102af788 fffffffffffffffd
GPR04: 0000000000000020 0000000000000030 00000000102b0760 0000000000000001
GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730 800000000280f033
GPR12: 0000000000004000 00007fffb7ffa100 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 0000000000000000 00007fffb7fef4b8
GPR28: 00007fffb7ff0000 0000000000000000 0000000000000000 00007fffffffef20
NIP [0000000000000700] 0x700
LR [0000000010004acc] 0x10004acc
Call Trace:
Instruction dump:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 7db243a6 7db142a6 f92d0080 7d20e2a6
---[ end trace 436f626dd098548c ]---
Sending IPI to other CPUs
IPI complete
kexec: Starting switchover sequence.
I'm in purgatory
-> smp_release_cpus()
spinning_secondaries = 0
<- smp_release_cpus()
Kernel panic - not syncing: Out of memory and no killable processes...
CPU: 2 PID: 1 Comm: swapper/2 Not tainted 4.12.14-197.18-default #1 SLE15-SP1
Call Trace:
[c000000012457210] [c000000008a20140] dump_stack+0xb0/0xf0 (unreliable)
[c000000012457250] [c000000008a1ccd4] panic+0x144/0x31c
[c0000000124572e0] [c0000000082efcc0] out_of_memory+0x3f0/0x700
[c000000012457380] [c0000000082f7ed4] __alloc_pages_nodemask+0x1004/0x10b0
[c000000012457570] [c00000000837f4d8] alloc_page_interleave+0x58/0x110
[c0000000124575b0] [c0000000083800bc] alloc_pages_current+0x16c/0x1d0
[c000000012457610] [c0000000082e8398] __page_cache_alloc+0xd8/0x150
[c000000012457650] [c0000000082e8574] pagecache_get_page+0x164/0x440
[c0000000124576b0] [c0000000082e8884] grab_cache_page_write_begin+0x34/0x70
[c0000000124576e0] [c00000000840ede8] simple_write_begin+0x48/0x190
[c000000012457720] [c0000000082e7c7c] generic_perform_write+0xec/0x270
[c0000000124577b0] [c0000000082ea2e0] __generic_file_write_iter+0x250/0x2a0
[c000000012457810] [c0000000082ea53c] generic_file_write_iter+0x20c/0x2e0
[c000000012457850] [c0000000083cc0e0] __vfs_write+0x120/0x1e0
[c0000000124578e0] [c0000000083cdfc8] vfs_write+0xd8/0x220
[c000000012457930] [c0000000083cfeec] SyS_write+0x6c/0x110
[c000000012457980] [c000000008d154c4] xwrite+0x54/0xb8
[c0000000124579c0] [c000000008d15574] do_copy+0x4c/0x17c
[c0000000124579f0] [c000000008d15140] write_buffer+0x64/0x90
[c000000012457a20] [c000000008d151d4] flush_buffer+0x68/0xf4
[c000000012457a70] [c000000008d62268] unxz+0x210/0x398
[c000000012457b10] [c000000008d15efc] unpack_to_rootfs+0x1f0/0x360
[c000000012457bc0] [c000000008d16108] populate_rootfs+0x9c/0x188
[c000000012457c40] [c00000000800f5d4] do_one_initcall+0x64/0x1d0
[c000000012457d00] [c000000008d14474] kernel_init_freeable+0x294/0x388
[c000000012457dc0] [c00000000801026c] kernel_init+0x2c/0x160
[c000000012457e30] [c00000000800b560] ret_from_kernel_thread+0x5c/0x7c
------------[ cut here ]------------
Diffing the two panic outputs looks like this (the only register value that
changes is the last word on the GPR08 row):
74,75c79,80
< CPU: 4 PID: 3082 Comm: rand_test Tainted: G 4.12.14-197.18-default #1 SLE15-SP1
< task: c00000002e226100 task.stack: c0000000387c8000
---
> CPU: 2 PID: 3079 Comm: rand_test Tainted: G 4.12.14-197.18-default #1 SLE15-SP1
> task: c00000002f6bcc00 task.stack: c0000000321fc000
77c82
< REGS: c00000001ebffd40 TRAP: 0300 Tainted: G (4.12.14-197.18-default)
---
> REGS: c00000001ec17d40 TRAP: 0300 Tainted: G (4.12.14-197.18-default)
83c88
< GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730 800000010280f033
---
> GPR08: 0000000000000000 00007fffb7dacc00 00000000102b0730 800000000280f033
95c100
< ---[ end trace 167d5d3b2e8a06e9 ]---
---
> ---[ end trace 436f626dd098548c ]---
106c111
< CPU: 4 PID: 1 Comm: swapper/4 Not tainted 4.12.14-197.18-default #1 SLE15-SP1
---
> CPU: 2 PID: 1 Comm: swapper/2 Not tainted 4.12.14-197.18-default #1 SLE15-SP1
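To see exactly which bit differs in the one changed register value (the last word on the GPR08 row, i.e. GPR11), the two words can be XORed. The sketch below is a rough Python illustration, not part of the original report. Treating that word as a saved MSR image is an assumption, as is the bit-name table, which follows the MSR bit positions as I understand them from the kernel's powerpc headers. The table can at least be checked against the dump, since the kernel itself decodes `8000000000001000` as `<SF,ME>`.

```python
# Assumed MSR bit positions (bit 0 = least significant); hypothetical table
# for illustration, covering only the bits that appear in these dumps.
ASSUMED_MSR_BITS = {
    63: "SF", 32: "TM", 25: "VEC", 23: "VSX",
    15: "EE", 14: "PR", 13: "FP", 12: "ME",
    5: "IR", 4: "DR", 1: "RI", 0: "LE",
}


def differing_bits(a, b):
    """Return the bit positions (0 = LSB) at which a and b differ."""
    x = a ^ b
    return [i for i in range(64) if (x >> i) & 1]


def decode(value):
    """Name the set bits of value, highest bit first, using the assumed table."""
    return [ASSUMED_MSR_BITS.get(i, "bit%d" % i)
            for i in range(63, -1, -1) if (value >> i) & 1]


gpr11_default = 0x800000010280F033  # first panic
gpr11_tm_off = 0x800000000280F033   # panic with ppc_tm=off

bits = differing_bits(gpr11_default, gpr11_tm_off)
print(bits)
print(decode(0x8000000000001000))  # MSR line from the dump: <SF,ME>
print(decode(gpr11_default))
```

If the assumption holds, the only difference between the two runs is bit 32, which the table names TM. That would be consistent with ppc_tm=off having taken effect without changing where the panic occurs.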
--
Carl Jacobsen
Storix, Inc.