[tip:sched/core] sched/core: Add debugging code to catch missing update_rq_clock() calls
Sachin Sant
sachinp at linux.vnet.ibm.com
Tue Jan 31 22:00:12 AEDT 2017
Trimming the cc list.
>> I assume I should be worried?
>
> Thanks for the report. No need to worry, the bug has existed for a
> while, this patch just turns on the warning ;-)
>
> The following commit queued up in tip/sched/core should fix your
> issues (assuming you see the same callstack on all your powerpc
> machines):
>
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=sched/core&id=1b1d62254df0fe42a711eb71948f915918987790
I still see this warning with today’s linux-next (next-20170131) running
inside a PowerVM LPAR on a POWER8 box. The stack trace is different from
the one Michael reported.
The easiest way to reproduce this is to offline and online CPUs.
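A minimal sketch of such a reproduction loop, assuming the standard sysfs hotplug control file /sys/devices/system/cpu/cpuN/online and root privileges. The function name toggle_cpu, its parameters, and the choice of CPU 2 are illustrative and not taken from the report.

```python
import os
import time


def toggle_cpu(cpu, cycles=10, delay=0.5, sysfs_root="/sys/devices/system/cpu"):
    """Offline and re-online one CPU `cycles` times through sysfs.

    Assumes the usual hotplug file cpuN/online exists for this CPU
    (the boot CPU may not have one on some platforms).
    Returns the list of values written, in order.
    """
    path = os.path.join(sysfs_root, f"cpu{cpu}", "online")
    written = []
    for _ in range(cycles):
        for state in ("0", "1"):  # "0" takes the CPU offline, "1" brings it back
            with open(path, "w") as f:
                f.write(state)
            written.append(state)
            time.sleep(delay)  # give the hotplug operation time to settle
    return written


# Example (run as root on the test machine, then check dmesg):
#   toggle_cpu(2)
```

The sysfs_root parameter exists only so the loop can be exercised against a scratch directory; on a real system the default path is the one to use.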
[ 114.795609] rq->clock_update_flags < RQCF_ACT_SKIP
[ 114.795621] ------------[ cut here ]------------
[ 114.795632] WARNING: CPU: 2 PID: 27 at kernel/sched/sched.h:804 set_next_entity+0xbc8/0xcc0
[ 114.795634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc rpadlpar_io rpaphp kvm_pr kvm ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm iw_cxgb3 ib_core ghash_generic xts gf128mul tpm_ibmvtpm tpm sg vmx_crypto pseries_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc binfmt_misc ip_tables xfs libcrc32c sr_mod sd_mod cdrom cxgb3 ibmvscsi ibmveth scsi_transport_srp mdio
[ 114.795751] dm_mirror dm_region_hash dm_log dm_mod
[ 114.795762] CPU: 2 PID: 27 Comm: migration/2 Not tainted 4.10.0-rc6-next-20170131 #1
[ 114.795765] task: c0000004fa2f8600 task.stack: c0000004fa49c000
[ 114.795768] NIP: c000000000114ed8 LR: c000000000114ed4 CTR: c0000000004a8cf0
[ 114.795771] REGS: c0000004fa49f6a0 TRAP: 0700 Not tainted (4.10.0-rc6-next-20170131)
[ 114.795773] MSR: 8000000002823033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE>
[ 114.795787] CR: 28004022 XER: 00000000
[ 114.795789] CFAR: c0000000008ec5c4 SOFTE: 0
GPR00: c000000000114ed4 c0000004fa49f920 c00000000100dd00 0000000000000026
GPR04: 0000000000000000 0000000000000006 6574616470755f6b c0000000011cdd00
GPR08: 0000000000000000 c000000000c6edb0 000000015ef20000 d000000006488538
GPR12: 0000000000004400 c00000000e801200 c0000000000ecc38 c0000004fe064300
GPR16: 0000000000000000 0000000000000001 0000000000000000 c000000000f27e08
GPR20: c000000000f277c5 0000000000000000 0000000000000004 0000000000000000
GPR24: c00000015fba49f0 c000000000f27e08 c000000000ef9e80 c0000004fa49fb00
GPR28: c00000015fba4980 c00000015fba49f0 c0000004f34c1000 c00000015fba49f0
[ 114.795850] NIP [c000000000114ed8] set_next_entity+0xbc8/0xcc0
[ 114.795855] LR [c000000000114ed4] set_next_entity+0xbc4/0xcc0
[ 114.795857] Call Trace:
[ 114.795862] [c0000004fa49f920] [c000000000114ed4] set_next_entity+0xbc4/0xcc0 (unreliable)
[ 114.795869] [c0000004fa49f9d0] [c000000000119f4c] pick_next_task_fair+0xfc/0x6f0
[ 114.795874] [c0000004fa49fae0] [c000000000104820] sched_cpu_dying+0x3c0/0x450
[ 114.795880] [c0000004fa49fb80] [c0000000000c1958] cpuhp_invoke_callback+0x148/0x5b0
[ 114.795886] [c0000004fa49fbf0] [c0000000000c3340] take_cpu_down+0xb0/0x110
[ 114.795893] [c0000004fa49fc50] [c0000000001a1e58] multi_cpu_stop+0x1a8/0x1e0
[ 114.795899] [c0000004fa49fca0] [c0000000001a20c4] cpu_stopper_thread+0x104/0x1e0
[ 114.795905] [c0000004fa49fd60] [c0000000000f2b90] smpboot_thread_fn+0x290/0x2a0
[ 114.795911] [c0000004fa49fdc0] [c0000000000ecd7c] kthread+0x14c/0x190
[ 114.795919] [c0000004fa49fe30] [c00000000000b4e8] ret_from_kernel_thread+0x5c/0x74
[ 114.795921] Instruction dump:
[ 114.795924] 0fe00000 4bfff884 3d02fff2 89289ac5 2f890000 40fef4ec 39200001 3c62ffac
[ 114.795936] 38633698 99289ac5 487d76b5 60000000 <0fe00000> 4bfff4cc eb9f0118 e93f0120
[ 114.795948] ---[ end trace 5c822f32f967fbc5 ]---
[ 123.059141] nr_pdflush_threads exported in /proc is scheduled for removal
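For anyone decoding the first line of the trace: the condition printed there is the debug check named in the subject line. Below is a rough user-space model of it, assuming the flag values from kernel/sched/sched.h at the time (RQCF_REQ_SKIP 0x01, RQCF_ACT_SKIP 0x02, RQCF_UPDATED 0x04). The Python class and the helper clock_is_stale are hypothetical simplifications, not kernel code.

```python
# Assumed flag values, mirroring kernel/sched/sched.h of that era.
RQCF_REQ_SKIP = 0x01
RQCF_ACT_SKIP = 0x02
RQCF_UPDATED = 0x04


class Rq:
    """Toy stand-in for struct rq, holding only the fields used here."""

    def __init__(self):
        self.clock_update_flags = 0
        self.clock = 0


def rq_pin_lock(rq):
    # Pinning the lock keeps only the skip bits, so RQCF_UPDATED is dropped.
    rq.clock_update_flags &= (RQCF_REQ_SKIP | RQCF_ACT_SKIP)


def update_rq_clock(rq, now):
    if rq.clock_update_flags & RQCF_ACT_SKIP:
        return  # an update skip is active, leave the clock alone
    rq.clock_update_flags |= RQCF_UPDATED
    rq.clock = now


def clock_is_stale(rq):
    # Same comparison as the warning text: flags < RQCF_ACT_SKIP means
    # neither "skip active" nor "updated since the last pin" is set.
    return rq.clock_update_flags < RQCF_ACT_SKIP


rq = Rq()
rq_pin_lock(rq)
stale_before = clock_is_stale(rq)   # no update since the pin
update_rq_clock(rq, 100)
stale_after = clock_is_stale(rq)    # update recorded
```

In this model, a path that reads the clock after rq_pin_lock() without calling update_rq_clock() first is what would trip the warning; the trace above suggests sched_cpu_dying() reaches pick_next_task_fair() in that state.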
Thanks
-Sachin