drbd: Bugfix for the connection behavior
author Philipp Reisner <philipp.reisner@linbit.com>
Thu, 8 Nov 2012 14:04:36 +0000 (15:04 +0100)
committer Philipp Reisner <philipp.reisner@linbit.com>
Thu, 8 Nov 2012 15:57:59 +0000 (16:57 +0100)
If we get into the C_BROKEN_PIPE cstate once, the state engine sets the
thi->t_state of the receiver thread to RESTARTING.  But with the while loop
in drbdd_init() a new connection gets established.  After that, the call into
drbdd() returns immediately since the thi->t_state is not RUNNING.  The
restart of drbdd_init() then resets thi->t_state to RUNNING.

I.e. after entering C_BROKEN_PIPE once, the next successfully established
connection gets wasted.

The two parts of the fix:
  * Do not cause the thread to restart if we detect the issue
    with the sockets while we are in C_WF_CONNECTION.

  * Make sure that all actions that would have set us to C_BROKEN_PIPE
    happen before the state change to C_WF_REPORT_PARAMS.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
drivers/block/drbd/drbd_receiver.c
drivers/block/drbd/drbd_state.c

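diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 8d521219480674e7dce71e11cad889d001968347..fff55657e0da888cae09a61ec85b491bc1326614 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c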
@@ -987,14 +987,9 @@ retry:
                }
        }
 
-       if (conn_request_state(tconn, NS(conn, C_WF_REPORT_PARAMS), CS_VERBOSE) < SS_SUCCESS)
-               return 0;
-
        sock->sk->sk_sndtimeo = timeout;
        sock->sk->sk_rcvtimeo = MAX_SCHEDULE_TIMEOUT;
 
-       drbd_thread_start(&tconn->asender);
-
        if (drbd_send_protocol(tconn) == -EOPNOTSUPP)
                return -1;
 
@@ -1008,6 +1003,11 @@ retry:
        }
        rcu_read_unlock();
 
+       if (conn_request_state(tconn, NS(conn, C_WF_REPORT_PARAMS), CS_VERBOSE) < SS_SUCCESS)
+               return 0;
+
+       drbd_thread_start(&tconn->asender);
+
        return h;
 
 out_release_sockets:
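diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 1132d87fa2845890f4b640f05118c77a486fad38..ecc5e27616682ca99d426dd55d21730eed9fca44 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c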
@@ -1055,7 +1055,7 @@ __drbd_set_state(struct drbd_conf *mdev, union drbd_state ns,
                drbd_thread_stop_nowait(&mdev->tconn->receiver);
 
        /* Upon network failure, we need to restart the receiver. */
-       if (os.conn > C_TEAR_DOWN &&
+       if (os.conn > C_WF_CONNECTION &&
            ns.conn <= C_TEAR_DOWN && ns.conn >= C_TIMEOUT)
                drbd_thread_restart_nowait(&mdev->tconn->receiver);