md/raid5: fix long-standing problem with bitmap handling on write failure.
author NeilBrown <neilb@suse.de>
Wed, 15 Jan 2014 22:35:38 +0000 (09:35 +1100)
committer NeilBrown <neilb@suse.de>
Wed, 15 Jan 2014 22:35:38 +0000 (09:35 +1100)

Before a write starts we set a bit in the write-intent bitmap.
When the write completes we clear that bit if the write was successful
to all devices.  However if the write wasn't fully successful we
should not clear the bit.  If the faulty drive is subsequently
re-added, the fact that the bit is still set ensures that we will
re-write the data that is missing.

This logic is mediated by the STRIPE_DEGRADED flag - we only clear the
bitmap bit when this flag is not set.
Currently we correctly set the flag if a write starts when some
devices are failed or missing.  But we do *not* set the flag if some
device failed during the write attempt.
This is wrong and can result in clearing the bit inappropriately.

So: set the flag when a write fails.
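
The effect can be illustrated with a small stand-alone model.  This is
only a sketch and not kernel code: struct toy_stripe, the toy_*
helpers and the with_fix switch are hypothetical names invented for
this illustration, with two booleans standing in for STRIPE_DEGRADED
and for the write-intent bitmap bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not taken from raid5.c. */
struct toy_stripe {
	bool degraded;   /* stands in for STRIPE_DEGRADED in sh->state */
	bool bitmap_bit; /* stands in for the write-intent bitmap bit  */
};

/* Before a write starts: set the bitmap bit and record whether any
 * device was already failed or missing (the case handled before the fix). */
static void toy_begin_write(struct toy_stripe *s, bool device_missing)
{
	assert(s != NULL);
	s->bitmap_bit = true;
	s->degraded = device_missing;
}

/* Completion of the write to one device.  with_fix selects the
 * behaviour described in this commit: mark the stripe degraded when
 * the write to a device fails. */
static void toy_end_write_request(struct toy_stripe *s, bool uptodate,
				  bool with_fix)
{
	assert(s != NULL);
	if (!uptodate && with_fix)
		s->degraded = true;
}

/* When the whole stripe write is done: clear the bitmap bit only if
 * the stripe is not flagged as degraded. */
static void toy_finish_write(struct toy_stripe *s)
{
	assert(s != NULL);
	if (!s->degraded)
		s->bitmap_bit = false;
}

/* Run one write in which all devices are present at the start but one
 * device fails during the write.  Returns true if the bitmap bit is
 * still set afterwards. */
static bool toy_bit_survives_failed_write(bool with_fix)
{
	struct toy_stripe s = { false, false };

	toy_begin_write(&s, false);
	toy_end_write_request(&s, true, with_fix);  /* first device: ok      */
	toy_end_write_request(&s, false, with_fix); /* second device: failed */
	toy_finish_write(&s);
	return s.bitmap_bit;
}
```

In this model, toy_bit_survives_failed_write(false) reports the bit as
cleared even though one device missed the write, while
toy_bit_survives_failed_write(true) reports it as still set, so a later
re-add would know to re-write the data.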

This bug has been present since bitmaps were introduced, so the fix is
suitable for any -stable kernel.

Reported-by: Ethan Wilson <ethan.wilson@shiftmail.org>
Cc: stable@vger.kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index cbb15716a5db31cf39a08b6a8a9df45d0bc9e0dc..3088d3af5a896d35c669a7489d880fa9456253ab 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2111,6 +2111,7 @@ static void raid5_end_write_request(struct bio *bi, int error)
                        set_bit(R5_MadeGoodRepl, &sh->dev[i].flags);
        } else {
                if (!uptodate) {
+                       set_bit(STRIPE_DEGRADED, &sh->state);
                        set_bit(WriteErrorSeen, &rdev->flags);
                        set_bit(R5_WriteError, &sh->dev[i].flags);
                        if (!test_and_set_bit(WantReplacement, &rdev->flags))