Hi Linux RAID team,

I'm encountering an issue when assembling an existing RAID6 md software RAID array with consistency policy "journal". The mdadm assemble process (the write to md/array_state) seems to hang indefinitely in the kernel.

Setup:
  kernel:          Linux version 6.13.9-061309-generic
  distro:          Ubuntu 24.04
  mdadm:           v4.3 - 2024-02-15 - Ubuntu 4.3-1ubuntu2.1
  raid disks:      10x 16TB spinning rust
  journal device:  100G lvm2 LV on an md RAID1 array

The issue occurs during incremental assembly by udev / systemd as well as when assembling the array manually after boot via:

  mdadm --assemble /dev/md123 /dev/disk/by-partlabel/hdd-raid6-? /dev/disk/ssd_raid1/hdd-raid6-journal

The problem initially occurred after rebooting from kernel 6.8.0-57-generic (Ubuntu) to kernel 6.11 (Ubuntu 24.04.2 HWE). Before the reboot the array had been running for weeks with stripe_cache_size=32768 and journal_mode=writeback. The shutdown may have hung as well, but I do not have any logs from it.

On assemble, the array is written to for a while (10-20 seconds; I am guessing the journal is being replayed) and then activity stops completely. During the replay the array's stripe_cache_size is automatically raised to 32768. Afterwards it is not possible to create any new md arrays (only tried mdadm --create so far), even on unrelated backing devices. /proc/mdstat and mdadm --detail show the array in a failed state with 10 spare devices. The superblocks still seem to be consistent and not updated; they show the array with 10 active devices plus the journal.

When the hang occurs the state of the journal is as follows:
  * 160525 valid journal metablocks including the journal tail (currently pointed to by the superblock)
  * containing 2,347,779 4k data blocks
  * 313,931 2*4k parity blocks
  * 307,281 flushes

I may be able to test on 6.14.2 on snapshots of the disks tomorrow, and I should be able to provide the device superblocks and the part of the journal device currently pointed to by journal_tail (or any other missing information) if requested.

The same problem occurs on 6.8.0-57-generic (Ubuntu) and 6.11.0-21-generic (Ubuntu) as well as on 6.13.9-061309-generic (mainline). dmesg / stacktrace for the latter is attached below:

> mdadm --assemble /dev/md123 /dev/disk/by-partlabel/hdd-raid6-? /dev/disk/ssd_raid1/hdd-raid6-journal

[ 2260.913878] kernel: md: md123 stopped.
[ 2261.229345] kernel: md/raid:md123: device sdd1 operational as raid disk 0
[ 2261.229351] kernel: md/raid:md123: device sdf1 operational as raid disk 9
[ 2261.229354] kernel: md/raid:md123: device sdl1 operational as raid disk 8
[ 2261.229356] kernel: md/raid:md123: device sdk1 operational as raid disk 7
[ 2261.229358] kernel: md/raid:md123: device sdi1 operational as raid disk 6
[ 2261.229360] kernel: md/raid:md123: device sdj1 operational as raid disk 5
[ 2261.229362] kernel: md/raid:md123: device sdc1 operational as raid disk 4
[ 2261.229364] kernel: md/raid:md123: device sdg1 operational as raid disk 3
[ 2261.229366] kernel: md/raid:md123: device sdh1 operational as raid disk 2
[ 2261.229368] kernel: md/raid:md123: device sde1 operational as raid disk 1
[ 2261.230770] kernel: md/raid:md123: raid level 6 active with 10 out of 10 devices, algorithm 2
[ 2459.072115] kernel: INFO: task mdadm:11089 blocked for more than 122 seconds.
[ 2459.072514] kernel: Not tainted 6.13.9-061309-generic #202503282144
[ 2459.072526] kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2459.072538] kernel: task:mdadm state:D stack:0 pid:11089 tgid:11089 ppid:11088 flags:0x00004002
[ 2459.072544] kernel: Call Trace:
[ 2459.072546] kernel: <TASK>
[ 2459.072551] kernel: __schedule+0x2b8/0x630
[ 2459.072558] kernel: schedule+0x29/0xd0
[ 2459.072562] kernel: raid5_get_active_stripe+0x277/0x300 [raid456]
[ 2459.072571] kernel: ? __pfx_autoremove_wake_function+0x10/0x10
[ 2459.072575] kernel: r5c_recovery_analyze_meta_block+0x5e6/0x690 [raid456]
[ 2459.072583] kernel: r5c_recovery_flush_log+0xc5/0x250 [raid456]
[ 2459.072589] kernel: r5l_recovery_log+0x118/0x260 [raid456]
[ 2459.072594] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072600] kernel: r5l_load_log+0x1dd/0x240 [raid456]
[ 2459.072606] kernel: r5l_start+0x1d/0x90 [raid456]
[ 2459.072612] kernel: raid5_start+0x18/0x20 [raid456]
[ 2459.072617] kernel: md_start+0x32/0x60
[ 2459.072620] kernel: do_md_run+0x7c/0x120
[ 2459.072624] kernel: array_state_store+0x3e8/0x470
[ 2459.072628] kernel: md_attr_store+0x8e/0x100
[ 2459.072633] kernel: sysfs_kf_write+0x3e/0x60
[ 2459.072636] kernel: kernfs_fop_write_iter+0x14c/0x1f0
[ 2459.072640] kernel: vfs_write+0x29c/0x460
[ 2459.072647] kernel: ksys_write+0x70/0xf0
[ 2459.072651] kernel: __x64_sys_write+0x19/0x30
[ 2459.072654] kernel: x64_sys_call+0x2a3/0x2310
[ 2459.072659] kernel: do_syscall_64+0x7e/0x170
[ 2459.072664] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072667] kernel: ? putname+0x60/0x80
[ 2459.072671] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072674] kernel: ? do_sys_openat2+0xa4/0xf0
[ 2459.072678] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072682] kernel: ? arch_exit_to_user_mode_prepare.isra.0+0x22/0xd0
[ 2459.072685] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072689] kernel: ? syscall_exit_to_user_mode+0x38/0x1d0
[ 2459.072692] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072695] kernel: ? do_syscall_64+0x8a/0x170
[ 2459.072699] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072702] kernel: ? arch_exit_to_user_mode_prepare.isra.0+0x22/0xd0
[ 2459.072705] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072708] kernel: ? syscall_exit_to_user_mode+0x38/0x1d0
[ 2459.072711] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072715] kernel: ? do_syscall_64+0x8a/0x170
[ 2459.072717] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072720] kernel: ? __wake_up+0x45/0x70
[ 2459.072724] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072727] kernel: ? md_wakeup_thread+0x54/0x90
[ 2459.072730] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072733] kernel: ? __mddev_resume+0x7f/0xa0
[ 2459.072736] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072739] kernel: ? md_ioctl+0x488/0x9e0
[ 2459.072743] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072746] kernel: ? rseq_get_rseq_cs+0x22/0x240
[ 2459.072750] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072753] kernel: ? rseq_ip_fixup+0x8d/0x1e0
[ 2459.072757] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072760] kernel: ? restore_fpregs_from_fpstate+0x3d/0xd0
[ 2459.072764] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072767] kernel: ? switch_fpu_return+0x4f/0xe0
[ 2459.072770] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072773] kernel: ? arch_exit_to_user_mode_prepare.isra.0+0xc8/0xd0
[ 2459.072776] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072779] kernel: ? syscall_exit_to_user_mode+0x38/0x1d0
[ 2459.072782] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072785] kernel: ? do_syscall_64+0x8a/0x170
[ 2459.072788] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072791] kernel: ? do_syscall_64+0x8a/0x170
[ 2459.072793] kernel: ? srso_alias_return_thunk+0x5/0xfbef5
[ 2459.072796] kernel: ? do_syscall_64+0x8a/0x170
[ 2459.072799] kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 2459.072802] kernel: RIP: 0033:0x77220111c574
[ 2459.072806] kernel: RSP: 002b:00007fff30657198 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 2459.072810] kernel: RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 000077220111c574
[ 2459.072812] kernel: RDX: 0000000000000008 RSI: 000055fd6e331b8d RDI: 0000000000000005
[ 2459.072814] kernel: RBP: 00007fff30657240 R08: 0000000000000073 R09: 0000000000000000
[ 2459.072815] kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 000055fd6e331b8d
[ 2459.072817] kernel: R13: 000055fda6505040 R14: 0000000000000000 R15: 000055fda6504c70
[ 2459.072823] kernel: </TASK>
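
In case it helps with triage, this is roughly how the post-hang state described above can be inspected; a minimal sketch, with the array name (md123) and member paths as in my setup, using only standard mdadm/procfs/sysfs interfaces:

  # Runtime view: shows the array as failed with 10 spare devices after the hang
  cat /proc/mdstat
  mdadm --detail /dev/md123

  # On-disk view: superblocks still report 10 active devices + journal
  mdadm --examine /dev/disk/by-partlabel/hdd-raid6-? /dev/disk/ssd_raid1/hdd-raid6-journal

  # Relevant sysfs attributes of the hung array
  cat /sys/block/md123/md/array_state
  cat /sys/block/md123/md/consistency_policy
  cat /sys/block/md123/md/stripe_cache_size

  # Optional: dump blocked tasks to dmesg (requires sysrq enabled); this
  # produces the same kind of hung-task trace as shown above
  echo w > /proc/sysrq-trigger
  dmesg | tail -n 200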