md: fix bug with re-adding of partially recovered device.
authorNeilBrown <neilb@suse.de>
Thu, 9 Dec 2010 05:36:28 +0000 (16:36 +1100)
committerNeilBrown <neilb@suse.de>
Thu, 9 Dec 2010 05:36:28 +0000 (16:36 +1100)
With v0.90 metadata, a hot-spare does not become a full member of the
array until recovery is complete.  So if we re-add such a device to
the array, we know that all of it is as up-to-date as the event count
would suggest, and so it a bitmap-based recovery is possible.

However with v1.x metadata, the hot-spare immediately becomes a full
member of the array, but it record how much of the device has been
recovered.  If the array is stopped and re-assembled recovery starts
from this point.

When such a device is hot-added to an array we currently lose the 'how
much is recovered' information and incorrectly included it as a full
in-sync member (after bitmap-based fixup).
This is wrong and unsafe and could corrupt data.

So be more careful about setting saved_raid_disk - which is what
guides the re-adding of devices back into an array.
The new code matches the code in slot_store which does a similar
thing, which is encouraging.

This is suitable for any -stable kernel.

Reported-by: "Dailey, Nate" <Nate.Dailey@stratus.com>
Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
drivers/md/md.c

index d66aaeddf95d7c14d16001a29e6e940c7adb5e6a..b757da1751805c72e79880af1f5bec83deb47558 100644 (file)
@@ -5159,7 +5159,7 @@ static int add_new_disk(mddev_t * mddev, mdu_disk_info_t *info)
                                PTR_ERR(rdev));
                        return PTR_ERR(rdev);
                }
-               /* set save_raid_disk if appropriate */
+               /* set saved_raid_disk if appropriate */
                if (!mddev->persistent) {
                        if (info->state & (1<<MD_DISK_SYNC)  &&
                            info->raid_disk < mddev->raid_disks)
@@ -5169,7 +5169,10 @@ static int add_new_disk(mddev_t * mddev, mdu_disk_info_t *info)
                } else
                        super_types[mddev->major_version].
                                validate_super(mddev, rdev);
-               rdev->saved_raid_disk = rdev->raid_disk;
+               if (test_bit(In_sync, &rdev->flags))
+                       rdev->saved_raid_disk = rdev->raid_disk;
+               else
+                       rdev->saved_raid_disk = -1;
 
                clear_bit(In_sync, &rdev->flags); /* just to be sure */
                if (info->state & (1<<MD_DISK_WRITEMOSTLY))