drbd: fix a race between start_resync and send_and_submit
author    Lars Ellenberg <lars@linbit.com>
          Mon, 28 Apr 2014 16:43:26 +0000 (18:43 +0200)
committer Jens Axboe <axboe@fb.com>
          Wed, 30 Apr 2014 19:46:55 +0000 (13:46 -0600)

In the drbd make request function, specifically in
drbd_send_and_submit(), we decide whether to send the actual
write request, or only a "set this block out of sync" notification.

We do so based on the current connection state, while holding the req_lock.
The connection state is not supposed to change while holding the req_lock.
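
As a rough illustration of that invariant, here is a minimal userspace
sketch using pthreads. Everything in it is hypothetical: the model_*
names, the three simplified states, and the mapping from state to action
are stand-ins chosen for the example, not DRBD's real enum drbd_conns or
its actual decision logic. It only shows the shape of "read the state
and decide, all under one lock".

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified states; NOT DRBD's real connection states. */
enum model_conn { MODEL_DISCONNECTED, MODEL_REPLICATING, MODEL_TRACK_ONLY };

/* What the submit path decides to put on the wire. */
enum model_action { MODEL_NO_SEND, MODEL_SEND_DATA, MODEL_SEND_OOS };

struct model_resource {
	pthread_mutex_t req_lock;  /* stands in for resource->req_lock */
	enum model_conn conn;      /* must only change with req_lock held */
	bool req_lock_held;        /* bookkeeping for the assertion below */
};

static void model_init(struct model_resource *res, enum model_conn conn)
{
	pthread_mutex_init(&res->req_lock, NULL);
	res->conn = conn;
	res->req_lock_held = false;
}

/* Caller must hold req_lock, so conn cannot change underneath us. */
static enum model_action model_decide_locked(const struct model_resource *res)
{
	assert(res->req_lock_held);
	switch (res->conn) {
	case MODEL_REPLICATING:
		return MODEL_SEND_DATA;  /* send the write itself */
	case MODEL_TRACK_ONLY:
		return MODEL_SEND_OOS;   /* only mark the block out of sync */
	default:
		return MODEL_NO_SEND;
	}
}

/* Model of the submit path: the decision happens entirely under req_lock. */
static enum model_action model_submit(struct model_resource *res)
{
	enum model_action action;

	pthread_mutex_lock(&res->req_lock);
	res->req_lock_held = true;
	action = model_decide_locked(res);
	res->req_lock_held = false;
	pthread_mutex_unlock(&res->req_lock);
	return action;
}
```

The decision is only trustworthy if every writer of conn takes the same
lock; a writer that skips it can change the state between the check and
the point where the request is queued.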

But drbd_start_resync() changed that state anyway,
while holding only the global_state_lock. That lock is enough to change
sync-after dependencies (paused vs active resync), but
not enough to change the connection state.

Fix: in drbd_start_resync, first grab the req_lock to serialize with
drbd_send_and_submit(), before grabbing the global_state_lock
to be able to evaluate the sync-after dependencies.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
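
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index 26338bedb25b71cb67e8bd342e8b8d05c7862252..34dde10fae48735b49cfc3434a8f109bc468cc52 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c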
@@ -1686,11 +1686,15 @@ void drbd_start_resync(struct drbd_device *device, enum drbd_conns side)
        }
        clear_bit(B_RS_H_DONE, &device->flags);
 
-       write_lock_irq(&global_state_lock);
+       /* req_lock: serialize with drbd_send_and_submit() and others
+        * global_state_lock: for stable sync-after dependencies */
+       spin_lock_irq(&device->resource->req_lock);
+       write_lock(&global_state_lock);
        /* Did some connection breakage or IO error race with us? */
        if (device->state.conn < C_CONNECTED
        || !get_ldev_if_state(device, D_NEGOTIATING)) {
-               write_unlock_irq(&global_state_lock);
+               write_unlock(&global_state_lock);
+               spin_unlock_irq(&device->resource->req_lock);
                mutex_unlock(device->state_mutex);
                return;
        }
@@ -1730,7 +1734,8 @@ void drbd_start_resync(struct drbd_device *device, enum drbd_conns side)
                }
                _drbd_pause_after(device);
        }
-       write_unlock_irq(&global_state_lock);
+       write_unlock(&global_state_lock);
+       spin_unlock_irq(&device->resource->req_lock);
 
        if (r == SS_SUCCESS) {
                /* reset rs_last_bcast when a resync or verify is started,