This bug has been present ever since data-check was introduce
in 2.6.16. However it would only fire if a data-check were
done on a degraded array, which was only possible if the array
has 3 or more devices. This is certainly possible, but is quite
uncommon.
Since hot-replace was added in 3.3 it can happen more often as
the same condition can arise if not all possible replacements are
present.
The problem is that as soon as we submit the last read request, the
'r1_bio' structure could be freed at any time, so we really should
stop looking at it. If the last device is being read from we will
stop looking at it. However if the last device is not due to be read
from, we will still check the bio pointer in the r1_bio, but the
r1_bio might already be free.
So use the read_targets counter to make sure we stop looking for bios
to submit as soon as we have submitted them all.
This fix is suitable for any -stable kernel since 2.6.16.
Cc: stable@vger.kernel.org
Reported-by: Arnold Schulz <arnysch@gmx.net>
Signed-off-by: NeilBrown <neilb@suse.de>
*/
if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
atomic_set(&r1_bio->remaining, read_targets);
- for (i = 0; i < conf->raid_disks * 2; i++) {
+ for (i = 0; i < conf->raid_disks * 2 && read_targets; i++) {
bio = r1_bio->bios[i];
if (bio->bi_end_io == end_sync_read) {
+ read_targets--;
md_sync_acct(bio->bi_bdev, nr_sectors);
generic_make_request(bio);
}