Hi,
在 2025/04/08 15:49, Meir Elisha 写道:
During recovery/check operations, the process_checks function loops
through available disks to find a 'primary' source with successfully
read data.
If no suitable source disk is found after checking all possibilities,
the 'primary' index will reach conf->raid_disks * 2. Add an explicit
check for this condition after the loop. If no source disk was found,
print an error message and return early to prevent further processing
without a valid primary source.
Signed-off-by: Meir Elisha <meir.elisha@xxxxxxxxxxx>
---
This was observed when forcefully disconnecting all iSCSI devices backing
a RAID1 array(using --failfast flag) during a check operation, causing
all reads within process_checks to fail. The resulting kernel oops shows:
BUG: kernel NULL pointer dereference, address: 0000000000000040
RIP: 0010:process_checks+0x25e/0x5e0 [raid1]
Code: ... <4c> 8b 53 40 ... // mov r10,[rbx+0x40]
Call Trace:
process_checks
sync_request_write
raid1d
md_thread
kthread
ret_from_fork
drivers/md/raid1.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 0efc03cea24e..b6a52c137f53 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2296,6 +2296,12 @@ static void process_checks(struct r1bio *r1_bio)
rdev_dec_pending(conf->mirrors[primary].rdev, mddev);
break;
}
+
+ if (primary >= conf->raid_disks * 2) {
+ pr_err_ratelimited("md/raid1:%s: unable to find source disk\n", mdname(mddev));
+ return;
+ }
This means all reads failed, then I'm a bit confused why
sync_request_write() doesn't return early?
end_sync_read
if (!bio->bi_status)
// uptodate is not set
set_bit(R1BIO_Uptodate, &r1_bio->state);
sync_request_write
if (!test_bit(R1BIO_Uptodate, &r1_bio->state))
if (!fix_sync_read_error(r1_bio))
// why not return here?
return;
process_checks(r1_bio);
The answer is that fix_sync_read_error() is used for the case just one
rdev is read, it just handle the bio from r1_bio->read_disk, and suppose
that just one rdev is read. And if rdev_set_badblocks() succeed,
fix_sync_read_error() will clear R1BIO_Uptodate and bi_status and
finally cause this problem.
While in this case, all member disks are read for check/repair, hence I
think fix_sync_read_error() is not supposed to be called here, and
better solution will be:
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index d422bab77580..dafda9095c73 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2346,10 +2346,15 @@ static void sync_request_write(struct mddev
*mddev, struct r1bio *r1_bio)
int disks = conf->raid_disks * 2;
struct bio *wbio;
- if (!test_bit(R1BIO_Uptodate, &r1_bio->state))
- /* ouch - failed to read all of that. */
- if (!fix_sync_read_error(r1_bio))
+ if (!test_bit(R1BIO_Uptodate, &r1_bio->state)) {
+ /*
+ * ouch - failed to read all of that. No need to fix
read error
+ * for check or repair because read all member disks failed.
+ */
+ if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) ||
+ !fix_sync_read_error(r1_bio))
return;
+ }
if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
process_checks(r1_bio);
Thanks,
Kuai
+
r1_bio->read_disk = primary;
for (i = 0; i < conf->raid_disks * 2; i++) {
int j = 0;