md/raid6: avoid data corruption during recovery of double-degraded RAID6
authorNeilBrown <neilb@suse.de>
Tue, 12 Aug 2014 23:57:07 +0000 (09:57 +1000)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 17 Sep 2014 16:04:00 +0000 (09:04 -0700)
commit 9c4bdf697c39805078392d5ddbbba5ae5680e0dd upstream.

During recovery of a double-degraded RAID6 it is possible for
some blocks not to be recovered properly, leading to corruption.

If a write happens to one block in a stripe that would be written to a
missing device, and at the same time that stripe is recovering data
to the other missing device, then that recovered data may not be written.

This patch skips, in the double-degraded case, an optimisation that is
only safe for single-degraded arrays.

Bug was introduced in 2.6.32 and fix is suitable for any kernel since
then.  In an older kernel with separate handle_stripe5() and
handle_stripe6() functions the patch must change handle_stripe6().

Fixes: 6c0069c0ae9659e3a91b68eaed06a5c6c37f45c8
Cc: Yuri Tikhonov <yur@emcraft.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Reported-by: "Manibalan P" <pmanibalan@amiindia.co.in>
Tested-by: "Manibalan P" <pmanibalan@amiindia.co.in>
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1090423
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/md/raid5.c

index 5e3c25d4562c2e497d8b50d6c21d9dfd75e2bdb4..774f81423d7808d7f4223e7574d09cc3eab1c6b6 100644 (file)
@@ -3561,6 +3561,8 @@ static void handle_stripe(struct stripe_head *sh)
                                set_bit(R5_Wantwrite, &dev->flags);
                                if (prexor)
                                        continue;
+                               if (s.failed > 1)
+                                       continue;
                                if (!test_bit(R5_Insync, &dev->flags) ||
                                    ((i == sh->pd_idx || i == sh->qd_idx)  &&
                                     s.failed == 0))