[next 20170227] CPU remove DLPAR operation WARN @ lib/refcount.c:128
Sachin Sant
sachinp at linux.vnet.ibm.com
Mon Feb 27 20:35:25 AEDT 2017
With the Feb 27 linux-next tree I am seeing inconsistent results from a CPU
remove DLPAR operation on a POWER8 LPAR.
After the CPU remove operation, the LPAR is reported as not SMT capable.
Before the operation:
# uname -r
4.10.0-next-20170227
# ppc64_cpu --smt
SMT=8
# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 8
Core(s) per socket: 1
Socket(s): 2
NUMA node(s): 4
Model: 2.1 (pvr 004b 0201)
Model name: POWER8 (architected), altivec supported
L1d cache: 64K
L1i cache: 32K
L2 cache: 512K
L3 cache: 8192K
NUMA node0 CPU(s):
NUMA node1 CPU(s): 0-7
NUMA node3 CPU(s):
NUMA node4 CPU(s): 8-15
After a DLPAR operation (CPU remove: 2 to 1) all the CPUs seem to be
removed. At the end of it I also see a warning at lib/refcount.c:128.
SMT capability is shown as disabled. It should have remained at 8.
# ppc64_cpu --smt
Machine is not SMT capable
The lscpu output shows 8 online CPUs, with threads per core still at 8.
[root@alp12 ~]# lscpu
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 8-15
Thread(s) per core: 8
Core(s) per socket: 1
Socket(s): 1
NUMA node(s): 4
Model: 2.1 (pvr 004b 0201)
Model name: POWER8 (architected), altivec supported
L1d cache: 64K
L1i cache: 32K
NUMA node0 CPU(s):
NUMA node1 CPU(s):
NUMA node3 CPU(s):
NUMA node4 CPU(s): 8-15
[root@alp12 ~]
[ 196.910677] cpu 8 (hwid 8) Ready to die...
[ 197.120324] cpu 9 (hwid 9) Ready to die...
[ 197.290265] cpu 10 (hwid 10) Ready to die...
[ 197.490234] cpu 11 (hwid 11) Ready to die...
[ 197.630110] cpu 12 (hwid 12) Ready to die...
[ 197.790094] cpu 13 (hwid 13) Ready to die...
[ 197.980016] cpu 14 (hwid 14) Ready to die...
[ 198.098137] cpu 15 (hwid 15) Ready to die...
[ 198.210074] pseries-hotplug-cpu: Failed to release drc (10000008) for CPU PowerPC,POWER8, rc: -17
[ 199.050648] cpu 0 (hwid 0) Ready to die...
[ 199.220530] cpu 1 (hwid 1) Ready to die...
[ 199.370459] cpu 2 (hwid 2) Ready to die...
[ 199.600322] cpu 3 (hwid 3) Ready to die...
[ 199.770259] cpu 4 (hwid 4) Ready to die...
[ 199.960189] cpu 5 (hwid 5) Ready to die...
[ 200.140145] cpu 6 (hwid 6) Ready to die...
[ 200.258067] cpu 7 (hwid 7) Ready to die...
[ 200.360320] refcount_t: underflow; use-after-free.
[ 200.360371] ------------[ cut here ]------------
[ 200.360385] WARNING: CPU: 10 PID: 7194 at lib/refcount.c:128 refcount_sub_and_test+0xb8/0xf0
[ 200.360398] Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp rpadlpar_io rpaphp tun bridge stp llc kvm iptable_filter vmx_crypto pseries_rng rng_core binfmt_misc nfsd ip_tables x_tables autofs4
[ 200.360472] CPU: 10 PID: 7194 Comm: drmgr Tainted: G W 4.10.0-next-20170227 #3
[ 200.360478] task: c0000008b7222b00 task.stack: c0000008b72dc000
[ 200.360483] NIP: c000000001b6b4b8 LR: c000000001b6b4b4 CTR: c000000001cefb50
[ 200.360488] REGS: c0000008b72df860 TRAP: 0700 Tainted: G W (4.10.0-next-20170227)
[ 200.360494] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE>
[ 200.360506] CR: 22000422 XER: 00000007
[ 200.360511] CFAR: c000000001faf738 SOFTE: 1
[ 200.360511] GPR00: c000000001b6b4b4 c0000008b72dfae0 c00000000266c300 0000000000000026
[ 200.360511] GPR04: c00000050fd8adb0 c00000050fda1660 0000000000419000 000000000000ff00
[ 200.360511] GPR08: 0000000000000000 c00000000235143c 000000050da40000 00000000000001d7
[ 200.360511] GPR12: 0000000000000000 c00000000ea82800 0000000000000000 0000000000000000
[ 200.360511] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 200.360511] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 200.360511] GPR24: 0000000000000000 0000000010018430 c0000005dd05f520 c0000008b72dfe00
[ 200.360511] GPR28: 0000000000000000 0000000000000016 0000000000000000 c0000008b71ffa18
[ 200.360570] NIP [c000000001b6b4b8] refcount_sub_and_test+0xb8/0xf0
[ 200.360575] LR [c000000001b6b4b4] refcount_sub_and_test+0xb4/0xf0
[ 200.360578] Call Trace:
[ 200.360582] [c0000008b72dfae0] [c000000001b6b4b4] refcount_sub_and_test+0xb4/0xf0 (unreliable)
[ 200.360588] [c0000008b72dfb40] [c000000001b4b0dc] kobject_put+0x3c/0xa0
[ 200.360595] [c0000008b72dfbb0] [c000000001e53bf4] of_node_put+0x24/0x40
[ 200.360602] [c0000008b72dfbd0] [c00000000165b4f4] dlpar_cpu_release+0x74/0xf0
[ 200.360608] [c0000008b72dfc20] [c0000000015e0e28] arch_cpu_release+0x38/0x70
[ 200.360615] [c0000008b72dfc40] [c000000001c49eb0] cpu_release_store+0x40/0x70
[ 200.360622] [c0000008b72dfc70] [c000000001c3d994] dev_attr_store+0x34/0x60
[ 200.360629] [c0000008b72dfc90] [c00000000191bc44] sysfs_kf_write+0x64/0xa0
[ 200.360634] [c0000008b72dfcb0] [c00000000191aa80] kernfs_fop_write+0x170/0x250
[ 200.360641] [c0000008b72dfd00] [c00000000187c330] __vfs_write+0x40/0x1c0
[ 200.360645] [c0000008b72dfd90] [c00000000187dc48] vfs_write+0xc8/0x240
[ 200.360650] [c0000008b72dfde0] [c00000000187f8b0] SyS_write+0x60/0x110
[ 200.360656] [c0000008b72dfe30] [c0000000015cb8e0] system_call+0x38/0xfc
[ 200.360660] Instruction dump:
[ 200.360663] 7d495378 419e0044 2f89ffff 7d434850 7f0a4840 79460020 41de001c 4099ffbc
[ 200.360675] 3c62ffb6 38636af8 48444249 60000000 <0fe00000> 38210060 38600000 e8010010
[ 200.360686] ---[ end trace 937482186422ac36 ]---
I have attached the dmesg log.
Thanks
-Sachin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cpu-dlpar-dmesg.log
Type: application/octet-stream
Size: 34273 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20170227/249d3c31/attachment-0001.obj>