TM Bad Thing exception easily raised from userspace

Laurent Dufour ldufour at linux.vnet.ibm.com
Sat Aug 20 03:21:44 AEST 2016


Hi,

While working on TM support for CRIU, I ran into a TM Bad Thing exception.

Digging further, I found that it is *easy* to raise it from user
space. I have attached a simple program which raises it every time,
like this:

[12045.221359] Kernel BUG at c000000000050a40 [verbose debug info unavailable]
[12045.221470] Unexpected TM Bad Thing exception at c000000000050a40 (msr 0x201033)
[12045.221540] Oops: Unrecoverable exception, sig: 6 [#1]
[12045.221586] SMP NR_CPUS=2048 NUMA PowerNV
[12045.221634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm uio_pdrv_genirq ipmi_powernv uio powernv_rng ipmi_msghandler autofs4 ses enclosure scsi_transport_sas bnx2x ipr mdio libcrc32c
[12045.222167] CPU: 68 PID: 6178 Comm: sigreturnpanic Not tainted 4.7.0 #34
[12045.222224] task: c0000000fce38600 ti: c0000000fceb4000 task.ti: c0000000fceb4000
[12045.222293] NIP: c000000000050a40 LR: c0000000000163bc CTR: 0000000000000000
[12045.222361] REGS: c0000000fceb7ac0 TRAP: 0700   Not tainted  (4.7.0)
[12045.222418] MSR: 9000000300201033 <SF,HV,ME,IR,DR,RI,LE,TM[SE]>  CR: 28444280  XER: 20000000
[12045.222625] CFAR: c0000000000163b8 SOFTE: 0
PACATMSCRATCH: 900000014280f033
GPR00: 01100000b8000001 c0000000fceb7d40 c00000000139c100 c0000000fce390d0
GPR04: 900000034280f033 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 b000000000001033 0000000000000001 0000000000000000
GPR12: 0000000000000000 c000000002926400 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 00003ffff98cadd0 00003ffff98cb470 0000000000000000
GPR28: 900000034280f033 c0000000fceb7ea0 0000000000000001 c0000000fce390d0
[12045.223535] NIP [c000000000050a40] tm_restore_sprs+0xc/0x1c
[12045.223584] LR [c0000000000163bc] tm_recheckpoint+0x5c/0xa0
[12045.223630] Call Trace:
[12045.223655] [c0000000fceb7d80] [c000000000026e74] sys_rt_sigreturn+0x494/0x6c0
[12045.223738] [c0000000fceb7e30] [c0000000000092e0] system_call+0x38/0x108
[12045.223806] Instruction dump:
[12045.223841] 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
[12045.223955] 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
[12045.224074] ---[ end trace cb8002ee240bae76 ]---

The exception is raised when the kernel restores the TM SPRs from
the signal stack. This operation is not allowed while in a transaction.

The sample test ends the signal handler with a transaction still pending,
and the signal itself was caught during a transaction.

I can't see any straightforward way to get rid of that, except by clearing
the transactional state in the sigreturn path.

Please advise.

Cheers,
Laurent.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sigreturnpanic.c
Type: text/x-csrc
Size: 3726 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20160819/d6aad7aa/attachment.c>

