[DLM] confirm master for recovered waiting requests
author David Teigland <teigland@redhat.com>
Fri, 8 Sep 2006 13:36:35 +0000 (08:36 -0500)
committer Steven Whitehouse <swhiteho@redhat.com>
Fri, 8 Sep 2006 21:00:12 +0000 (17:00 -0400)
Fix the following scenario:
- A request is on the waiters list waiting for a reply from a remote node.
- The request is the first one on the resource, so first_lkid is set.
- The remote node fails, causing recovery.
- During recovery the requesting node becomes master.
- The request is now processed locally instead of being a remote operation.
- At this point we need to call confirm_master() on the resource since
  we're certain we're now the master node.  This will clear first_lkid.
- We weren't calling confirm_master(), so first_lkid was never cleared,
  causing subsequent requests on that resource to get stuck.
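
The scenario above can be illustrated with a small standalone model. This is
only a sketch under stated assumptions, not the kernel code: struct toy_rsb
and every toy_* function are hypothetical names invented here, and the rule
that later requests wait while first_lkid is nonzero is a simplification of
the behavior described in the list.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a DLM resource (not the kernel rsb). */
struct toy_rsb {
	unsigned int first_lkid;  /* id of the first in-flight request, 0 if none */
	bool is_master;           /* true once the local node masters the resource */
};

/* Toy confirm_master(): with error == 0 mastery is certain, so clear first_lkid. */
static void toy_confirm_master(struct toy_rsb *r, int error)
{
	if (!r->first_lkid)
		return;
	if (error == 0)
		r->first_lkid = 0;
}

/*
 * Toy admission rule (an assumption of this sketch): a request may proceed
 * only if no first request is pending, or if it is that first request.
 */
static bool toy_can_proceed(const struct toy_rsb *r, unsigned int lkid)
{
	return r->first_lkid == 0 || r->first_lkid == lkid;
}

/*
 * Toy recovery step: the local node becomes master and the waiting request
 * is handled locally. apply_fix selects the behavior with or without the
 * added confirm_master() call.
 */
static void toy_recover_request(struct toy_rsb *r, bool apply_fix)
{
	r->is_master = true;
	if (apply_fix && r->is_master)
		toy_confirm_master(r, 0);
}
```

In this model, starting from first_lkid = 1 and calling
toy_recover_request() without the fix leaves first_lkid set, so
toy_can_proceed() returns false for a second request with lkid 2. With the
fix, first_lkid is cleared and the same request can proceed.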

Signed-off-by: David Teigland <teigland@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
fs/dlm/lock.c

index 67247f0b508a5fe2de76c90283baff19c54cff24..af2f2f01bd5fd423e8381ee6ceca01d2d100b58b 100644
@@ -3283,6 +3283,8 @@ int dlm_recover_waiters_post(struct dlm_ls *ls)
                        hold_rsb(r);
                        lock_rsb(r);
                        _request_lock(r, lkb);
+                       if (is_master(r))
+                               confirm_master(r, 0);
                        unlock_rsb(r);
                        put_rsb(r);
                        break;