md/raid10: attempt to fix read errors during resync/check
author NeilBrown <neilb@suse.de>
Thu, 28 Jul 2011 01:39:25 +0000 (11:39 +1000)
committer NeilBrown <neilb@suse.de>
Thu, 28 Jul 2011 01:39:25 +0000 (11:39 +1000)
We already attempt to fix read errors found during normal IO
and during a 'repair' process.
It is best to try to repair them at any time they are found,
so move a test so that during sync and check a read error will
be corrected by over-writing with good data.
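
The effect of moving the test can be sketched as a small truth-table
model. This is only an illustration, not the kernel code: the function
names and the three boolean parameters are hypothetical stand-ins for
"the copy was read successfully", "the copy matches the reference copy"
and "MD_RECOVERY_CHECK is set".

```c
#include <stdbool.h>

/* Hypothetical model of the per-copy decision BEFORE this patch. */
bool rewrite_before(bool read_ok, bool matches, bool check_only)
{
    if (read_ok && matches)
        return false;   /* copy is consistent, nothing to write */
    if (check_only)
        return false;   /* 'check' skipped the write, even after a read error */
    return true;
}

/* Hypothetical model of the per-copy decision AFTER this patch. */
bool rewrite_after(bool read_ok, bool matches, bool check_only)
{
    if (read_ok) {
        if (matches)
            return false;   /* copy is consistent, nothing to write */
        /* a mismatch is counted at this point in the real code */
        if (check_only)
            return false;   /* 'check' reports the mismatch but does not fix it */
    }
    /* Either a mismatch outside 'check', or an unreadable block. */
    return true;
}
```

The only input for which the two models differ is a failed read while
'check' is running: previously it was skipped, now it is over-written
with good data.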

If both (all) devices have known bad blocks in the sync section we
won't try to fix even though the bad blocks might not overlap.  That
should be considered later.

Also if we hit a read error during recovery we don't try to fix it.
It would only be possible to fix if there were at least three copies
of data, which is not very common with RAID10.  But it should still
be considered later.
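
The "at least three copies" figure follows from simple counting. A
minimal sketch, assuming exactly one copy is on the device being
rebuilt and exactly one of the remaining copies hits the read error
(the function name is hypothetical, not part of raid10.c):

```c
#include <stdbool.h>

/* Hypothetical helper: could a read error seen during recovery be fixed? */
bool can_fix_read_error_during_recovery(int copies)
{
    int readable = copies - 1;            /* one copy is the device being rebuilt */
    int left_after_error = readable - 1;  /* one source copy just failed to read */

    /* Fixing needs at least one other good copy to over-write from. */
    return left_after_error >= 1;
}
```

With the usual two-copy layout this returns false; with three copies it
returns true.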

Signed-off-by: NeilBrown <neilb@suse.de>
drivers/md/raid10.c

index 909450414c67cf192c952cdd923cf4190cae7254..10415ddfcb420e33f8597adc2f55091e2323d228 100644
@@ -1541,11 +1541,12 @@ static void sync_request_write(mddev_t *mddev, r10bio_t *r10_bio)
                        if (j == vcnt)
                                continue;
                        mddev->resync_mismatches += r10_bio->sectors;
+                       if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery))
+                               /* Don't fix anything. */
+                               continue;
                }
-               if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery))
-                       /* Don't fix anything. */
-                       continue;
-               /* Ok, we need to write this bio
+               /* Ok, we need to write this bio, either to correct an
+                * inconsistency or to correct an unreadable block.
                 * First we need to fixup bv_offset, bv_len and
                 * bi_vecs, as the read request might have corrupted these
                 */