sunrpc: Don't engage exponential backoff when connection attempt is rejected.
author NeilBrown <neilb@suse.com>
Wed, 23 Nov 2016 03:44:58 +0000 (14:44 +1100)
committer Trond Myklebust <trond.myklebust@primarydata.com>
Thu, 1 Dec 2016 22:40:41 +0000 (17:40 -0500)
xs_connect() contains an exponential backoff mechanism so that repeated
connection attempts are delayed by longer and longer amounts.

This is appropriate when the connection failed due to a timeout, but
it is not appropriate when a definitive "no" answer is received.  In
such cases, call_connect_status() imposes a minimum 3-second backoff,
so not having the exponential backoff will never result in immediate
retries.
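
As a rough illustration (not part of this patch), the difference between
the two behaviours can be modelled in user space.  The names and the
values INIT_DELAY and MAX_DELAY below are assumptions made for the
sketch, not the kernel's identifiers or constants; INIT_DELAY also
stands in for the fixed 3-second delay mentioned above.

```c
#include <assert.h>

/* Illustrative values in seconds; assumptions for this sketch only. */
#define INIT_DELAY 3u
#define MAX_DELAY  300u

struct backoff {
	unsigned int delay;	/* delay to apply before the next attempt; 0 = unset */
};

/* Return the delay for this attempt and double it for the next one, capped. */
static unsigned int backoff_next(struct backoff *b)
{
	unsigned int cur;

	assert(b);
	if (b->delay == 0)
		b->delay = INIT_DELAY;
	cur = b->delay;
	b->delay = (cur * 2 > MAX_DELAY) ? MAX_DELAY : cur * 2;
	return cur;
}

/* Forget the accumulated backoff, as a "close" of the transport would. */
static void backoff_reset(struct backoff *b)
{
	assert(b);
	b->delay = 0;
}

/*
 * Total time spent waiting across n failed attempts.  With reset_each
 * set, every failure is treated as a definitive rejection and the
 * backoff state is cleared before the next attempt.
 */
static unsigned int total_wait(unsigned int n, int reset_each)
{
	struct backoff b = { 0 };
	unsigned int total = 0, i;

	for (i = 0; i < n; i++) {
		total += backoff_next(&b);
		if (reset_each)
			backoff_reset(&b);
	}
	return total;
}
```

In this model, six rejected attempts wait 189 seconds in total when the
backoff keeps doubling, and 18 seconds when it is reset after each
rejection.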

The current situation is a problem when the NFS server tries to
register with rpcbind but rpcbind isn't running.  All connection
attempts are made on the same "xprt" and, as the connection is never
"closed", the exponential backoff delays successive attempts to
register, or de-register, different protocols.  This results in a
multi-minute delay with no benefit.

So, when call_connect_status() receives a definitive "no", use
xprt_conditional_disconnect() to cancel the previous connection attempt.
This will set XPRT_CLOSE_WAIT so that xprt->ops->close() calls xs_close()
which resets the reestablish_timeout.

To ensure xprt_conditional_disconnect() does the right thing, we
ensure that rq_connect_cookie is set before a connection attempt, and
allow xprt_conditional_disconnect() to complete even when the
transport is not fully connected.
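
The effect of the two xprt.c changes can be sketched with a small
user-space model.  This is a hedged illustration only: the struct and
function names are hypothetical, locking and the autoclose scheduling
are omitted, and the require_connected flag exists purely to contrast
the old check with the new one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the transport state relevant here. */
struct model_xprt {
	unsigned int connect_cookie;	/* identifies the current connection attempt */
	bool connected;
	bool closing;
	bool close_wait;		/* set when a close has been requested */
};

/* Hypothetical stand-in for the request's saved cookie. */
struct model_req {
	unsigned int connect_cookie;
};

/* Record which attempt this request belongs to before connecting. */
static void model_connect_start(const struct model_xprt *x, struct model_req *r)
{
	r->connect_cookie = x->connect_cookie;
}

/*
 * Request a close only if the cookie still names the current attempt.
 * require_connected = true models the old behaviour, which bailed out
 * on a transport that was not connected; false models the new one.
 */
static void model_conditional_disconnect(struct model_xprt *x,
					 unsigned int cookie,
					 bool require_connected)
{
	assert(x);
	if (cookie != x->connect_cookie)
		return;		/* stale request: a newer attempt exists */
	if (x->closing)
		return;		/* a close is already in progress */
	if (require_connected && !x->connected)
		return;		/* old check: nothing to disconnect */
	x->close_wait = true;
}
```

With the old check a rejected, never-connected transport is left
untouched, so nothing triggers the close that would reset the timeout;
without it, a matching cookie is enough to request the close.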

Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
net/sunrpc/clnt.c
net/sunrpc/xprt.c

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 62a482790937b54a5d486bc932289b0fb14da248..1efbe48e794f804b24f30e566994b873a85727bb 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1926,6 +1926,8 @@ call_connect_status(struct rpc_task *task)
        case -EADDRINUSE:
        case -ENOBUFS:
        case -EPIPE:
+               xprt_conditional_disconnect(task->tk_rqstp->rq_xprt,
+                                           task->tk_rqstp->rq_connect_cookie);
                if (RPC_IS_SOFTCONN(task))
                        break;
                /* retry with existing socket, after a delay */
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 685e6d225414ee55f57237873a902b4c313e54c4..9a6be030ca7d21dbfee63664536872cc395bdc55 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -669,7 +669,7 @@ void xprt_conditional_disconnect(struct rpc_xprt *xprt, unsigned int cookie)
        spin_lock_bh(&xprt->transport_lock);
        if (cookie != xprt->connect_cookie)
                goto out;
-       if (test_bit(XPRT_CLOSING, &xprt->state) || !xprt_connected(xprt))
+       if (test_bit(XPRT_CLOSING, &xprt->state))
                goto out;
        set_bit(XPRT_CLOSE_WAIT, &xprt->state);
        /* Try to schedule an autoclose RPC call */
@@ -772,6 +772,7 @@ void xprt_connect(struct rpc_task *task)
        if (!xprt_connected(xprt)) {
                task->tk_rqstp->rq_bytes_sent = 0;
                task->tk_timeout = task->tk_rqstp->rq_timeout;
+               task->tk_rqstp->rq_connect_cookie = xprt->connect_cookie;
                rpc_sleep_on(&xprt->pending, task, xprt_connect_status);
 
                if (test_bit(XPRT_CLOSING, &xprt->state))