Kernel mistakenly "starts" resync on fully-degraded, newly-created raid10 array

I tried asking on Reddit and ended up resolving the issue myself:
https://www.reddit.com/r/linuxquestions/comments/1lh9to0/kernel_is_stuck_resyncing_a_4drive_raid10_array/

I run Debian sid and am using kernel 6.12.32-amd64:

#apt-cache policy linux-image-amd64
linux-image-amd64:
  Installed: 6.12.32-1
  Candidate: 6.12.32-1
  Version table:
 *** 6.12.32-1 500
        500 http://mirrors.kernel.org/debian unstable/main amd64 Packages
        500 http://http.us.debian.org/debian unstable/main amd64 Packages
        100 /var/lib/dpkg/status

#uname -r
6.12.32-amd64

To summarize the issue and my diagnostic steps, I ran this command to create a new raid10 array:

#mdadm --create md13 --name=media --level=10 --layout=f2 -n 4 /dev/sdb1 missing /dev/sdf1 missing

At that point, /proc/mdstat showed the following, which makes no sense:

md127 : active raid10 sdb1[2] sdc1[0]
      23382980608 blocks super 1.2 512K chunks 2 far-copies [4/2] [U_U_]
      [>....................]  resync =  0.0% (8594688/23382980608) finish=25176161501.3min speed=0K/sec
      bitmap: 175/175 pages [700KB], 65536KB chunk

With 2 drives present and 2 drives absent, the array can only start if the present drives are considered in sync; in this [U_U_] state no chunk has a second surviving copy, so there is nothing for a resync to do. The kernel nevertheless spent most of a day in this state. The "8594688" count increased very slowly over time, but after 24 hours it had only reached 0.1%. During that time, I had mounted the array and transferred 11TB of data onto it.

Then, when I power-cycled the machine, swapped SATA cables, and added the remaining drives, they were marked as spares and were not added to the array (likely because the array was considered to be already resyncing):

#mdadm --detail /dev/md127
/dev/md127:
[...]
    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       -       0        0        1      removed
       2       8       17        2      active sync   /dev/sdb1
       -       0        0        3      removed

       4       8        1        -      spare   /dev/sda1
       5       8       65        -      spare   /dev/sde1
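
For completeness, the member superblocks can be inspected directly (a sketch, assuming the same device names):

#mdadm --examine /dev/sda1

I'd expect this to report "Device Role : spare" rather than a numbered slot, matching the --detail output above.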


I ended up resolving the issue by recreating the array with --assume-clean:

#mdadm --create md19 --name=media3 --assume-clean --readonly --level=10 --layout=f2 -n 4 /dev/sdc1 missing /dev/sdb1 missing
To optimalize recovery speed, it is recommended to enable write-indent bitmap, do you want to enable it now? [y/N]? y
mdadm: /dev/sdc1 appears to be part of a raid array:
       level=raid10 devices=4 ctime=Sun Jun 22 00:51:33 2025
mdadm: /dev/sdb1 appears to be part of a raid array:
       level=raid10 devices=4 ctime=Sun Jun 22 00:51:33 2025
Continue creating array [y/N]? y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md/md19 started.

#cat /proc/mdstat
Personalities : [raid1] [raid10] [raid0] [raid6] [raid5] [raid4]
md127 : active (read-only) raid10 sdb1[2] sdc1[0]
      23382980608 blocks super 1.2 512K chunks 2 far-copies [4/2] [U_U_]
      bitmap: 175/175 pages [700KB], 65536KB chunk
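
A word of caution for anyone copying this fix: recreating with --assume-clean only preserves the data if the device order, chunk size, layout, and data offset all match the original array. A sketch of double-checking those against the old superblocks before the (destructive) --create:

#mdadm --examine /dev/sdc1 /dev/sdb1 | grep -E 'Data Offset|Raid Devices|Layout|Chunk Size|Device Role'

The "Device Role" lines also confirm which slot each disk originally held, which determines the order to list them in.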

At that point, I was able to add the new devices and have the array start rebuilding as expected:

#mdadm --manage /dev/md127 --add /dev/sda1 --add /dev/sde1
mdadm: added /dev/sda1
mdadm: added /dev/sde1

#cat /proc/mdstat
Personalities : [raid1] [raid10] [raid0] [raid6] [raid5] [raid4]
md127 : active raid10 sde1[5] sda1[4] sdc1[0] sdb1[2]
      23382980608 blocks super 1.2 512K chunks 2 far-copies [4/2] [U_U_]
      [>....................]  recovery =  0.0% (714112/11691490304) finish=1091.3min speed=178528K/sec
      bitmap: 0/175 pages [0KB], 65536KB chunk
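
As an aside, the recovery speed shown there is subject to the usual md throttles; they can be inspected or raised while the rebuild runs (a sketch, with an illustrative value):

#sysctl dev.raid.speed_limit_min
#sysctl -w dev.raid.speed_limit_min=100000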

#mdadm --detail /dev/md127
/dev/md127:
[...]
    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       5       8       65        1      spare rebuilding   /dev/sde1
       2       8       17        2      active sync   /dev/sdb1
       4       8        1        3      spare rebuilding   /dev/sda1
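
Once the recovery completes, a scrub should confirm that the rebuilt copies actually agree (a sketch, again assuming md127):

#echo check > /sys/block/md127/md/sync_action
#cat /sys/block/md127/md/mismatch_cnt

A "mismatch_cnt" of 0 after the check finishes would confirm the mirrors are consistent.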

--xsdg




