During recovery/check operations, the process_checks function loops through available disks to find a 'primary' source with successfully read data. If no suitable source disk is found after checking all possibilities, the 'primary' index will reach conf->raid_disks * 2. Add an explicit check for this condition after the loop. If no source disk was found, print an error message and return early to prevent further processing without a valid primary source. Signed-off-by: Meir Elisha <meir.elisha@xxxxxxxxxxx> --- Changes from v1: - Don't fix read errors on recovery/check This was observed when forcefully disconnecting all iSCSI devices backing a RAID1 array(using --failfast flag) during a check operation, causing all reads within process_checks to fail. The resulting kernel oops shows: BUG: kernel NULL pointer dereference, address: 0000000000000040 RIP: 0010:process_checks+0x25e/0x5e0 [raid1] Code: ... <4c> 8b 53 40 ... // mov r10,[rbx+0x40] Call Trace: process_checks sync_request_write raid1d md_thread kthread ret_from_fork drivers/md/raid1.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 0efc03cea24e..de9bccbe7337 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -2200,14 +2200,9 @@ static int fix_sync_read_error(struct r1bio *r1_bio) if (!rdev_set_badblocks(rdev, sect, s, 0)) abort = 1; } - if (abort) { - conf->recovery_disabled = - mddev->recovery_disabled; - set_bit(MD_RECOVERY_INTR, &mddev->recovery); - md_done_sync(mddev, r1_bio->sectors, 0); - put_buf(r1_bio); + if (abort) return 0; - } + /* Try next page */ sectors -= s; sect += s; @@ -2346,10 +2341,21 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio) int disks = conf->raid_disks * 2; struct bio *wbio; - if (!test_bit(R1BIO_Uptodate, &r1_bio->state)) - /* ouch - failed to read all of that. */ - if (!fix_sync_read_error(r1_bio)) + if (!test_bit(R1BIO_Uptodate, &r1_bio->state)) { + /* + * ouch - failed to read all of that. + * No need to fix read error for check/repair + * because all member disks are read. + */ + if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) || + !fix_sync_read_error(r1_bio)) { + conf->recovery_disabled = mddev->recovery_disabled; + set_bit(MD_RECOVERY_INTR, &mddev->recovery); + md_done_sync(mddev, r1_bio->sectors, 0); + put_buf(r1_bio); return; + } + } if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) process_checks(r1_bio); -- 2.34.1