xfs: don't break from growfs ag update loop on error
author Eric Sandeen <sandeen@sandeen.net>
Fri, 11 Oct 2013 19:14:05 +0000 (14:14 -0500)
committer Ben Myers <bpm@sgi.com>
Thu, 17 Oct 2013 18:31:42 +0000 (13:31 -0500)

When xfs_growfs_data_private() is updating backup superblocks,
it bails out on the first error encountered, whether reading or
writing:

* If we get an error writing out the alternate superblocks,
* just issue a warning and continue.  The real work is
* already done and committed.

This can cause a problem later during repair, because repair
looks at all superblocks, and picks the most prevalent one
as correct.  If we bail out early in the backup superblock
loop, we can end up with more "bad" matching superblocks than
good, and a post-growfs repair may revert the filesystem to
the old geometry.
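
To make the failure mode concrete, here is a toy model of the "most prevalent superblock wins" rule described above. It is only a sketch and not xfs_repair's actual code: each superblock copy is reduced to a single geometry value (its data block count), and pick_prevalent_dblocks() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model, NOT xfs_repair's real logic: every superblock copy is
 * represented only by its data block count, and the value held by the
 * most copies is treated as correct.  Ties go to the value seen first.
 */
static uint64_t pick_prevalent_dblocks(const uint64_t *copies, size_t n)
{
	uint64_t best = 0;
	size_t best_votes = 0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		size_t votes = 0;

		/* Count how many copies agree with copy i. */
		for (size_t j = 0; j < n; j++)
			if (copies[j] == copies[i])
				votes++;

		if (votes > best_votes) {
			best_votes = votes;
			best = copies[i];
		}
	}
	return best;
}
```

Under this model, if the update loop stops after the first failure, every later copy keeps the old block count, and the old geometry can outvote the new one. If the loop skips only the failing copy, the new geometry keeps its majority.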

With the combination of superblock verifiers and old bugs,
we're more likely to encounter read errors due to verification.

And perhaps even worse, we don't even properly write any of the
newly-added superblocks in the new AGs.

Even with this change, growfs will still say:

  xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Structure needs cleaning
  data blocks changed from 319815680 to 335216640

which might be confusing to the user, but it at least communicates
that something has gone wrong, and dmesg will probably highlight
the need for an xfs_repair.

And this is still best-effort; if verifiers fail on more than
half the backup supers, they may still "win" - but that's probably
best left to repair to more gracefully handle by doing its own
strict verification as part of the backup super "voting."
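
The control flow this patch adopts can be sketched outside the kernel as follows. This is a minimal illustration, not the kernel code: update_fn, update_all_backups() and fail_one_ag() are hypothetical names, the callback stands in for the read/convert/write step for one AG, and the loop is assumed to start at AG 1 because AG 0 holds the primary superblock.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for "read, update and write back the secondary
 * superblock of one AG".  Returns 0 on success or a positive error code.
 */
typedef int (*update_fn)(unsigned int agno, void *ctx);

/*
 * Visit every secondary superblock.  On failure, remember the error and
 * move on to the next AG instead of leaving the loop, so that one bad AG
 * does not leave all later copies stale.
 */
static int update_all_backups(unsigned int agcount, update_fn update,
			      void *ctx, unsigned int *updated)
{
	int saved_error = 0;

	assert(update != NULL);
	assert(updated != NULL);

	*updated = 0;
	for (unsigned int agno = 1; agno < agcount; agno++) {
		int error = update(agno, ctx);

		if (error) {
			saved_error = error;	/* report it at the end */
			continue;		/* but keep updating the rest */
		}
		(*updated)++;
	}
	return saved_error;
}

/* Demo updater (hypothetical): fails with 5 (EIO) only on the AG in *ctx. */
static int fail_one_ag(unsigned int agno, void *ctx)
{
	const unsigned int *bad = ctx;

	return agno == *bad ? 5 : 0;
}
```

With eight AGs and a failure on AG 2, this sketch still updates the other six secondary copies and returns the saved error, which matches the "growfs still reports failure" behaviour described above.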

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

fs/xfs/xfs_fsops.c
index fdae4ec5f21b7874f9d7eae4228e532d3dbf1fe2..76c7b2b4fa8dab6b1307899aa81f24ed3a1b37f4 100644
@@ -155,7 +155,7 @@ xfs_growfs_data_private(
        xfs_buf_t               *bp;
        int                     bucket;
        int                     dpct;
-       int                     error;
+       int                     error, saved_error = 0;
        xfs_agnumber_t          nagcount;
        xfs_agnumber_t          nagimax = 0;
        xfs_rfsblock_t          nb, nb_mod;
@@ -498,29 +498,33 @@ xfs_growfs_data_private(
                                error = ENOMEM;
                }
 
+               /*
+                * If we get an error reading or writing alternate superblocks,
+                * continue.  xfs_repair chooses the "best" superblock based
+                * on most matches; if we break early, we'll leave more
+                * superblocks un-updated than updated, and xfs_repair may
+                * pick them over the properly-updated primary.
+                */
                if (error) {
                        xfs_warn(mp,
                "error %d reading secondary superblock for ag %d",
                                error, agno);
-                       break;
+                       saved_error = error;
+                       continue;
                }
                xfs_sb_to_disk(XFS_BUF_TO_SBP(bp), &mp->m_sb, XFS_SB_ALL_BITS);
 
-               /*
-                * If we get an error writing out the alternate superblocks,
-                * just issue a warning and continue.  The real work is
-                * already done and committed.
-                */
                error = xfs_bwrite(bp);
                xfs_buf_relse(bp);
                if (error) {
                        xfs_warn(mp,
                "write error %d updating secondary superblock for ag %d",
                                error, agno);
-                       break; /* no point in continuing */
+                       saved_error = error;
+                       continue;
                }
        }
-       return error;
+       return saved_error ? saved_error : error;
 
  error0:
        xfs_trans_cancel(tp, XFS_TRANS_ABORT);