ocfs2: flush inode data to disk and free inode when i_count becomes zero
author Xue jiufei <xuejiufei@huawei.com>
Fri, 4 Sep 2015 22:44:11 +0000 (15:44 -0700)
committer Linus Torvalds <torvalds@linux-foundation.org>
Fri, 4 Sep 2015 23:54:41 +0000 (16:54 -0700)
Disk inode deletion may be heavily delayed when one node unlinks a file
after the same dentry has been freed on another node (say N1) because of
memory shrinking, while the inode is left in memory.  This inode can only
be freed when N1 does the orphan scan work.

However, N1 may skip the orphan scan several times because other nodes
may do the work first.  In our tests, the delay can reach 1 hour on a
4-node cluster, which hurts the user experience.  So we think the inode
should be freed once its data has been flushed to disk when i_count
becomes zero, to avoid such circumstances.

Signed-off-by: Joyce.xue <xuejiufei@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
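
The locking pattern this change relies on can be modelled in user space.
The sketch below is illustrative only and is not kernel code: toy_inode,
toy_write_now, toy_drop_inode and the TOY_* flags are hypothetical names,
and a pthread mutex stands in for inode->i_lock.  It shows the sequence
used by the patch: mark the object as about to be freed while the lock is
held, drop the lock around the blocking write-back, retake the lock, clear
the mark, and return nonzero so that the caller evicts the object
immediately.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for inode state bits; the values are arbitrary. */
#define TOY_WILL_FREE 0x1u
#define TOY_DIRTY     0x2u

struct toy_inode {
	pthread_mutex_t lock;   /* plays the role of inode->i_lock */
	unsigned int state;     /* plays the role of inode->i_state */
	int flushed;            /* number of write-backs performed */
};

/*
 * Stand-in for a synchronous write-back.  The real call may sleep, so it
 * must be entered without the lock held; this toy version takes the lock
 * itself only for the short state update.
 */
static void toy_write_now(struct toy_inode *ino)
{
	pthread_mutex_lock(&ino->lock);
	ino->state &= ~TOY_DIRTY;
	ino->flushed++;
	pthread_mutex_unlock(&ino->lock);
}

/*
 * Called with ino->lock held and returns with it held.
 * A nonzero return tells the caller to evict the object right away.
 */
static int toy_drop_inode(struct toy_inode *ino)
{
	/* Mark the object so that concurrent lookups know it is going away. */
	ino->state |= TOY_WILL_FREE;

	/* Drop the lock around the blocking flush, then retake it. */
	pthread_mutex_unlock(&ino->lock);
	toy_write_now(ino);
	pthread_mutex_lock(&ino->lock);

	/* The caller's eviction path takes over from here. */
	ino->state &= ~TOY_WILL_FREE;

	return 1;
}
```

A caller would initialise a toy_inode with PTHREAD_MUTEX_INITIALIZER, take
the lock, call toy_drop_inode(), and evict the object if the return value
is nonzero.  Always returning 1 is the design choice under discussion: the
object is never kept cached after the last reference is dropped, at the
cost of a synchronous flush on that path.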
fs/ocfs2/inode.c

index b254416dc8d92d0fc1c66c0b2787e313de7712b5..4e69f3cbc5f182f5ba2a7dd89dcc286c7409f5bd 100644
@@ -1191,17 +1191,19 @@ void ocfs2_evict_inode(struct inode *inode)
 int ocfs2_drop_inode(struct inode *inode)
 {
        struct ocfs2_inode_info *oi = OCFS2_I(inode);
-       int res;
 
        trace_ocfs2_drop_inode((unsigned long long)oi->ip_blkno,
                                inode->i_nlink, oi->ip_flags);
 
-       if (oi->ip_flags & OCFS2_INODE_MAYBE_ORPHANED)
-               res = 1;
-       else
-               res = generic_drop_inode(inode);
+       assert_spin_locked(&inode->i_lock);
+       inode->i_state |= I_WILL_FREE;
+       spin_unlock(&inode->i_lock);
+       write_inode_now(inode, 1);
+       spin_lock(&inode->i_lock);
+       WARN_ON(inode->i_state & I_NEW);
+       inode->i_state &= ~I_WILL_FREE;
 
-       return res;
+       return 1;
 }
 
 /*