[PATCH net-next] af_unix: fix a fatal race with bit fields

Benjamin Herrenschmidt benh at kernel.crashing.org
Fri May 3 22:57:27 EST 2013


On Fri, 2013-05-03 at 11:01 +0930, Alan Modra wrote:
> On Tue, Apr 30, 2013 at 10:04:32PM -0700, Eric Dumazet wrote:
> > These kind of errors are pretty hard to find, its a pity to spend time
> > on them.
> 
> Well, yes.  From the first comment in gcc PR52080.  "For the following
> testcase we generate a 8 byte RMW cycle on IA64 which causes locking
> problems in the linux kernel btrfs filesystem."
> 
> Did someone fix btrfs, but not check other kernel locks?  Having now
> hit the same problem again, have you checked that other kernel locks
> don't have adjacent bit fields in the same 64-bit word?  And comment
> the struct to ensure someone doesn't optimize those unsigned chars
> back to bit fields.

Unfortunately, fixing "other" kernel locks is near impossible.

One could try to grep for all spinlock_t and maybe even all atomic_t,
may even write a script to spot automatically if a bitfield appears
to be around (though it could be hidden behind a structure etc...) but
what about an int accessed with cmxchg (a kernel macro doing a
lwarx/stwcx. loop on a value) for example ? There's plenty of these...

I don't think we can realistically "fix" all potential occurrences of
that bug in the kernel short of geting rid of all bitfields, which isn't
going to happen any time soon.

I'm afraid this *must* be fixed at the compiler level, with as backports
much as can realistically be done back to distros.

Ben.
 



More information about the Linuxppc-dev mailing list