Repeated corruption of file->f_ep_lock

Mon Sep 19 09:23:56 EST 2005

On Sat, Sep 17, 2005 at 12:27:17PM +0100, David Woodhouse wrote:
> For a while I've been seeing occasional deadlocks on one CPU of a PPC
> SMP machine:
> 
> _spin_lock(c8cbf250) CPU#1 NIP c02bb740 holder: cpu 2305 pc 00000000 (lock 24000484)
> 
> Further debugging shows that it's always due to file->f_ep_lock being
> corrupted, and the deadlock happens when epoll is used on such a file.
> The owner_cpu field is almost always 2305. However, it's not due to the
> epoll code itself -- I've turned all three of the epoll syscalls into
> sys_ni_syscall and it's still happening. I also added sanity checks for
> (file->f_ep_lock.owner_cpu > 1) throughout fs/file_table.c, and I see it
> happen ten or twenty times during a kernel compile.
> 
> The previous and next members of 'struct file', which are f_ep_list and
> f_mapping respectively, are always fine. It's just f_ep_lock which is
> scribbled upon, and the scribble is fairly repeatable: 'owner_cpu' is
> almost always set to 0x901 but occasionally 0x501, and the 'lock' field
> has values like 20282484, 24042884, 28022484, 24042084, 22000424 (hex).
> Do those numbers seem meaningful to anyone? Any clues as to where they
> might be coming from?

As Paul mentioned, these look very much like the contents of the
condition register (CR), which is saved and restored in only a few places:
- on every exception/interrupt
- in some stack frames, when GCC decides that it needs to use some
of the 3 CR fields (out of 8) that must be preserved across function
calls. This is rather infrequent.
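
To make the resemblance concrete, here is a rough sketch (my own helper
names, not kernel code) that splits a 32-bit word into the eight 4-bit
CR fields and checks that no field has more than one of LT/GT/EQ set,
which is what an integer compare leaves behind. The check is only a
heuristic I am assuming for illustration, since CR logical instructions
can produce other patterns. Note also that 0x901 is 2305 decimal, so
the "cpu 2305" in the spinlock message and owner_cpu 901 are the same
value.

```c
#include <stdint.h>

/* Bits within one 4-bit CR field after an integer compare. */
enum { CR_LT = 0x8, CR_GT = 0x4, CR_EQ = 0x2, CR_SO = 0x1 };

/* Extract CR field n (0..7); CR0 is the most significant nibble. */
static unsigned cr_field(uint32_t cr, unsigned n)
{
	return (cr >> (28 - 4 * n)) & 0xf;
}

/*
 * Heuristic (an assumption for illustration): a word is "CR-like" if
 * no field has more than one of LT/GT/EQ set. SO is ignored.
 */
static int looks_like_cr(uint32_t word)
{
	for (unsigned n = 0; n < 8; n++) {
		unsigned cmp = cr_field(word, n) & (CR_LT | CR_GT | CR_EQ);

		if (cmp & (cmp - 1))	/* more than one bit set */
			return 0;
	}
	return 1;
}
```

All of the reported 'lock' values (20282484, 24042884, 28022484,
24042084, 22000424, 24000484, 24002484) pass this check: every nibble
is 0, 2, 4 or 8.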

> 
> During a kernel compile, the corruption is mostly detected in fget()
> from vfs_fstat(), but also I've seen it once or twice in vfs_read() from
> do_execve():
> 
>  File cb2f5b40 (fops d107c980) has corrupted f_epoll_lock!
>  lock 24002484, owner_pc 0, owner_cpu 901
>  f->private_data 00000000, f->f_ep_links (cb2f5bc8, cb2f5bc8), f->f_mapping cc21c1c8
>  f->f_mapping->a_ops d107cad8
>  Pid 16648, comm gcc
>  File is /usr/bin/gcc
>  Badness in dumpbadfile at fs/file_table.c:133
>  Call trace:
>   [c00059b8] check_bug_trap+0xa8/0x120
>   [c0005c94] ProgramCheckException+0x264/0x4e0
>   [c00050a8] ret_from_except_full+0x0/0x4c
>   [c0080bb4] dumpbadfile+0x114/0x160
>   [c007f9f0] vfs_read+0xa0/0x1c0
>   [c008ef7c] kernel_read+0x3c/0x60
>   [c0091810] do_execve+0x1e0/0x280
>   [c0008594] sys_execve+0x64/0xd0
>   [c0004980] ret_from_syscall+0x0/0x44

It's hard to imagine a stack overflow on such a short
call chain. The other idea I have is a corrupted back
chain, but GCC-generated code is not very sensitive
to that unless you use alloca()...

In this case, it would be very useful to also have the 
value of the stack pointer (r1) on each line of the call
backtrace. (The PPC ABI makes the call backtrace much more 
reliable than on x86, where a backtrace without a frame pointer
is an educated guess at best.)
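
To show what I mean by printing r1 per frame, here is a rough
user-space sketch of a back chain walk. It is not the kernel's
backtrace code. It assumes the 32-bit layout where the word at the
stack pointer holds the back chain and the next word is the LR save
slot, and it runs over a simulated memory array; the names
walk_back_chain and struct frame_rec are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* One backtrace line: the frame's stack pointer and its LR save word. */
struct frame_rec {
	uint32_t sp;
	uint32_t lr;
};

/*
 * Walk the back chain in simulated memory (word index = address / 4).
 * Assumed layout: mem[sp/4] is the back chain, mem[sp/4 + 1] is the
 * LR save word. The LR save word of the innermost frame is only
 * meaningful once a callee has stored into it.
 * Returns the number of frames recorded in out[].
 */
static size_t walk_back_chain(const uint32_t *mem, size_t nwords,
			      uint32_t sp, struct frame_rec *out, size_t max)
{
	size_t n = 0;

	while (sp != 0 && n < max) {
		uint32_t next;

		if (sp % 4 || sp / 4 + 1 >= nwords)
			break;		/* outside the simulated stack */

		out[n].sp = sp;
		out[n].lr = mem[sp / 4 + 1];
		n++;

		next = mem[sp / 4];
		if (next != 0 && next <= sp)
			break;		/* chain must move to higher addresses */
		sp = next;
	}
	return n;
}
```

With r1 recorded for every frame, a back chain that stops increasing
or jumps out of the task's stack is visible right away, and the frame
sizes can be compared with what the functions ought to allocate.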

	Regards,
	Gabriel