NFS: Fix possible endless state recovery wait
author Chuck Lever <chuck.lever@oracle.com>
Thu, 17 Oct 2013 18:14:10 +0000 (14:14 -0400)
committer Trond Myklebust <Trond.Myklebust@netapp.com>
Mon, 28 Oct 2013 19:31:55 +0000 (15:31 -0400)
In nfs4_wait_clnt_recover(), hold a reference to the clp being
waited on.  The state manager can reduce clp->cl_count to 1, in
which case the nfs_put_client() in nfs4_run_state_manager() can
free *clp before wait_on_bit() returns and allows
nfs4_wait_clnt_recover() to run again.

The behavior at that point is non-deterministic.  If the waited-on
bit still happens to be zero, wait_on_bit() will wake the waiter as
expected.  If the bit is set again (say, because the memory was
poisoned when freed), wait_on_bit() can leave the waiter asleep.

This is a narrow fix which ensures the safety of accessing *clp in
nfs4_wait_clnt_recover(), but does not address the continued use
of a possibly freed *clp after nfs4_wait_clnt_recover() returns
(see nfs_end_delegation_return(), for example).
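
The pattern described above (pin the object before waiting, drop the
pin on a single exit path) can be sketched outside the kernel.  This is
a minimal single-threaded model, not the kernel code: struct client,
client_new(), client_put(), manager_finish() and clients_freed are
hypothetical names, the reference count is a plain int rather than an
atomic_t, and the sleep in wait_on_bit() is modeled by running the
manager to completion.  The killable-wait error path is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nfs_client; not the kernel type. */
struct client {
	int refcount;		/* models clp->cl_count (atomic in the kernel) */
	bool mgr_running;	/* models the NFS4CLNT_MANAGER_RUNNING bit */
	int cons_state;		/* models clp->cl_cons_state */
};

static int clients_freed;	/* lets a caller observe when the free happens */

/* Create a client whose only reference belongs to the state manager. */
static struct client *client_new(int cons_state)
{
	struct client *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->refcount = 1;
	c->mgr_running = true;
	c->cons_state = cons_state;
	return c;
}

/* Drop one reference; free the client when the last one goes away. */
static void client_put(struct client *c)
{
	if (--c->refcount == 0) {
		free(c);
		clients_freed++;
	}
}

/* Models the state manager exiting: clear the bit, then drop its reference. */
static void manager_finish(struct client *c)
{
	c->mgr_running = false;
	client_put(c);
}

static int wait_clnt_recover(struct client *c)
{
	int res = 0;

	c->refcount++;			/* pin c for the duration of the wait */
	while (c->mgr_running)
		manager_finish(c);	/* stands in for sleeping in wait_on_bit() */
	if (c->cons_state < 0)
		res = c->cons_state;	/* safe: our reference keeps c alive */
	client_put(c);			/* single exit path drops the pin */
	return res;
}
```

Without the increment at the top of wait_clnt_recover(), the manager's
client_put() would free the client, and both the loop condition and the
cons_state read would touch freed memory.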

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>

diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 62c08bf60e655eb405fececa732061c2b992317e..452f4c8dadeae77f2313d1c80cba200203a10396 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1255,14 +1255,16 @@ int nfs4_wait_clnt_recover(struct nfs_client *clp)
 
        might_sleep();
 
+       atomic_inc(&clp->cl_count);
        res = wait_on_bit(&clp->cl_state, NFS4CLNT_MANAGER_RUNNING,
                        nfs_wait_bit_killable, TASK_KILLABLE);
        if (res)
-               return res;
-
+               goto out;
        if (clp->cl_cons_state < 0)
-               return clp->cl_cons_state;
-       return 0;
+               res = clp->cl_cons_state;
+out:
+       nfs_put_client(clp);
+       return res;
 }
 
 int nfs4_client_recover_expired_lease(struct nfs_client *clp)