I've hit a lockdep splat with generic/270 test complaining that:
3216.fsstress.b/3533 is trying to acquire lock:
(jbd2_handle){++++..}, at: [<
ffffffff813152e0>] jbd2_log_wait_commit+0x0/0x150
but task is already holding lock:
(jbd2_handle){++++..}, at: [<
ffffffff8130bd3b>] start_this_handle+0x35b/0x850
The underlying problem is that jbd2_journal_force_commit_nested()
(called from ext4_should_retry_alloc()) may get called while a
transaction handle is started. In such case it takes care to not wait
for commit of the running transaction (which would deadlock) but only
for a commit of a transaction that is already committing (which is safe
as that doesn't wait for any filesystem locks).
In fact there are also other callers of jbd2_log_wait_commit() that take
care to pass tid of a transaction that is already committing and for
those cases, the lockdep instrumentation is too restrictive and leading
to false positive reports. Fix the problem by calling
jbd2_might_wait_for_commit() from jbd2_log_wait_commit() only if the
transaction isn't already committing.
Fixes:
1eaa566d368b214d99cbb973647c1b0b8102a9ae
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
{
int err = 0;
- jbd2_might_wait_for_commit(journal);
read_lock(&journal->j_state_lock);
+#ifdef CONFIG_PROVE_LOCKING
+ /*
+ * Some callers make sure transaction is already committing and in that
+ * case we cannot block on open handles anymore. So don't warn in that
+ * case.
+ */
+ if (tid_gt(tid, journal->j_commit_sequence) &&
+ (!journal->j_committing_transaction ||
+ journal->j_committing_transaction->t_tid != tid)) {
+ read_unlock(&journal->j_state_lock);
+ jbd2_might_wait_for_commit(journal);
+ read_lock(&journal->j_state_lock);
+ }
+#endif
#ifdef CONFIG_JBD2_DEBUG
if (!tid_geq(journal->j_commit_request, tid)) {
printk(KERN_ERR