[BUG] 2.6.24-rc2-mm1 - kernel bug on nfs v4

Trond Myklebust trond.myklebust at fys.uio.no
Mon Nov 19 06:18:06 EST 2007


On Sun, 2007-11-18 at 19:44 +0100, Torsten Kaiser wrote:
> On Nov 18, 2007 12:05 AM, Peter Zijlstra <a.p.zijlstra at chello.nl> wrote:
> > I've been staring at this NFS code for a while and can't make any sense
> > out of it. It seems to correctly initialize the waitqueue. So this would
> > indicate corruption of some sort.
> 
> No, it does not "correctly" initialize the waitqueue. It doesn't even
> try to initialize it.
> 
> I now found the guilty patch and what is wrong with it.
> 
> nfs-stop-sillyname-renames-and-unmounts-from-racing.patch adds:
> 
> @@ -110,8 +112,22 @@ struct nfs_server {
>                                                    filesystem */
>  #endif
>         void (*destroy)(struct nfs_server *);
> +
> +       atomic_t active; /* Keep trace of any activity to this server */
> +       wait_queue_head_t active_wq;  /* Wait for any activity to stop  */
> 
> and tries to initialize it:
> @@ -593,6 +593,10 @@ static int nfs_init_server(struct nfs_server *server,
>         server->namelen  = data->namlen;
>         /* Create a client RPC handle for the NFSv3 ACL management interface */
>         nfs_init_server_aclclient(server);
> +
> +       init_waitqueue_head(&server->active_wq);
> +       atomic_set(&server->active, 0);
> +
> 
> and then uses it via nfs_sb_active and nfs_sb_deactive:
> 
> @@ -29,6 +29,7 @@ struct nfs_unlinkdata {
>  static void
>  nfs_free_unlinkdata(struct nfs_unlinkdata *data)
>  {
> +       nfs_sb_deactive(NFS_SERVER(data->dir));
>         iput(data->dir);
>         put_rpccred(data->cred);
>         kfree(data->args.name.name);
> @@ -151,6 +152,7 @@ static int nfs_do_call_unlink(struct dentry *parent, struct inode *dir, struct n
>                 nfs_dec_sillycount(dir);
>                 return 0;
>         }
> +       nfs_sb_active(NFS_SERVER(dir));
>         data->args.fh = NFS_FH(dir);
>         nfs_fattr_init(&data->res.dir_attr);
> 
> 
> But it does not notice this:
> struct dentry_operations nfs_dentry_operations = {
>         .d_revalidate   = nfs_lookup_revalidate,
>         .d_delete       = nfs_dentry_delete,
>         .d_iput         = nfs_dentry_iput,
> };
> struct dentry_operations nfs4_dentry_operations = {
>         .d_revalidate   = nfs_open_revalidate,
>         .d_delete       = nfs_dentry_delete,
>         .d_iput         = nfs_dentry_iput,
> };
> 
> NFSv2/3 and NFSv4 share the same dentry_iput and so share the same
> unlink and sillyrename logic.
> But they do not share nfs_init_server()!
> 
> I wonder why this doesn't blow up more violently, but only hangs...
> 
> But as I don't know whether it is correct to add the waitqueue
> initialization to nfs4_init_server() or to remove the nfs_sb_active /
> nfs_sb_deactive calls for the NFSv4 case, I can't offer a patch to fix this.
> 
> Torsten

I had already fixed that one in my own stack. Attached are the 3 patches
that I've got: 1 from SteveD, plus 2 fixes on top of it.

Andrew, could you please unapply the sillyrename patches you've got, and
apply these 3 instead?

Trond

-------------- next part --------------
An embedded message was scrubbed...
From: Steve Dickson <SteveD at redhat.com>
Subject: NFS: Stop sillyname renames and unmounts from racing
Date: Thu, 08 Nov 2007 04:05:04 -0500
Size: 4097
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20071118/3eed1c1b/attachment.eml>
-------------- next part --------------
An embedded message was scrubbed...
From: Trond Myklebust <Trond.Myklebust at netapp.com>
Subject: NFS: Fix up problems with Steve's sillyrename fix
Date: Sat, 17 Nov 2007 13:08:49 -0500
Size: 4242
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20071118/3eed1c1b/attachment-0001.eml>
-------------- next part --------------
An embedded message was scrubbed...
From: Trond Myklebust <Trond.Myklebust at netapp.com>
Subject: NFS: Fix nfs_free_unlinkdata()
Date: Sat, 17 Nov 2007 13:52:36 -0500
Size: 1292
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20071118/3eed1c1b/attachment-0002.eml>
-------------- next part --------------
# This series applies on GIT commit 4c1fe2f78a08e2c514a39c91a0eb7b55bbd3c0d2
linux-2.6.24-005-fix_sillyrename_bug_on_umount.dif
linux-2.6.24-006-fix_to_fix_sillyrename_bug_on_umount.dif
linux-2.6.24-007-fix_nfs_free_unlinkdata.dif

