[RFC 0/2] Reenable might_sleep() checks for might_fault() when atomic

David Hildenbrand dahi at linux.vnet.ibm.com
Thu Nov 27 02:32:07 AEDT 2014


> On Wed, Nov 26, 2014 at 05:17:29PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Nov 26, 2014 at 11:05:04AM +0100, David Hildenbrand wrote:
> > > > What's the path you are trying to debug?
> > > 
> > > Well, we had a problem where we held a spin_lock and called
> > > copy_(from|to)_user(). We experienced very random deadlocks that took some guy
> > > almost a week to debug. The simple might_sleep() check would have showed this
> > > error immediately.
> > 
> > This must have been a very old kernel.
> > A modern kernel will return an error from copy_to_user.
> > Which is really the point of the patch you are trying to revert.
> 
> That's assuming you disabled preemption. If you didn't, and take
> a spinlock, you have deadlocks even without userspace access.
> 

(Thanks for your resent, my first email was sent directly to you ... grml)

This is what happened on our side (very recent kernel):

spin_lock(&lock)
copy_to_user(...)
spin_unlock(&lock)

1. s390 locks/unlocks a spin lock with a compare and swap, using the _cpu id_
   as "old value"
2. we slept during copy_to_user()
3. the thread got scheduled onto another cpu
4. spin_unlock failed as the _cpu id_ didn't match (another cpu that locked
   the spinlock tried to unlocked it).
5. lock remained locked -> deadlock

Christian came up with the following explanation:
Without preemption, spin_lock() will not touch the preempt counter.
disable_pfault() will always touch it.

Therefore, with preemption disabled, copy_to_user() has no idea that it is
running in atomic context - and will therefore try to sleep.

So copy_to_user() will on s390:
1. run "as atomic" while spin_lock() with preemption enabled.
2. run "as not atomic" while spin_lock() with preemption disabled.
3.  run "as atomic" while pagefault_disabled() with preemption enabled or
disabled.
4. run "as not atomic" when really not atomic.

And exactly nr 2. is the thing that produced the deadlock in our scenario and
the reason why I want a might_sleep() :)



More information about the Linuxppc-dev mailing list