On 2025/09/02 23:46, Davidlohr Bueso wrote: > On Mon, 01 Sep 2025, syzbot wrote: > >> syzbot has bisected this issue to: >> >> commit 5b67d43976828dea2394eae2556b369bb7a61f64 >> Author: Davidlohr Bueso <dave@xxxxxxxxxxxx> >> Date: Fri Apr 18 01:59:17 2025 +0000 >> >> fs/buffer: use sleeping version of __find_get_block() > > I don't think this bisection is right, considering this issue was first > triggered last year (per the dashboard). I think this bisection is not bogus; at least that commit made this problem easily triggerable enough to find a reproducer... What is common to this report is that deactivate_super() is blocked waiting for hfs_sync_fs() to complete and release sb->s_umount lock. Current sample crash report (shown below) tells us that PID = 5962 (who is trying to hold for write) is blocked inside deactivate_super() waiting for PID = 6254 (who is already holding for read) to release sb->s_umount lock. But since PID = 6254 is blocked at io_schedule(), PID = 6254 can't release sb->s_umount lock. The question is why PID = 6254 is blocked for two minutes waiting for io_schedule() to complete. I suspect that commit 5b67d4397682 is relevant, for that commit has changed the behavior of bdev_getblk() which PID = 6254 is blocked. Some method for reporting what is happening (e.g. report details when folio_lock() is blocked for more than 10 seconds) is wanted. Of course, it is possible that a corrupted hfs filesystem image is leading to an infinite loop... INFO: task syz-executor:5962 blocked for more than 143 seconds. Not tainted syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor state:D stack:21832 pid:5962 tgid:5962 ppid:1 task_flags:0x400140 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5357 [inline] __schedule+0x16f3/0x4c20 kernel/sched/core.c:6961 __schedule_loop kernel/sched/core.c:7043 [inline] rt_mutex_schedule+0x77/0xf0 kernel/sched/core.c:7339 rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272 __super_lock fs/super.c:57 [inline] __super_lock_excl fs/super.c:72 [inline] deactivate_super+0xa9/0xe0 fs/super.c:506 cleanup_mnt+0x425/0x4c0 fs/namespace.c:1375 task_work_run+0x1d4/0x260 kernel/task_work.c:227 exit_to_user_mode_loop+0[ 309.321754][ T38] resume_user_mode_work include/linux/resume_user_mode.h:50 [inline] exit_to_user_mode_loop+0[ 309.321754][ T38] exit_to_user_mode_loop+0xec/0x110 kernel/entry/common.c:43 exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline] syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline] syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline] do_syscall_64+0x2bd/0x3b0 arch/x86/entry/syscall_64.c:100 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff4a4aaff17 RSP: 002b:00007ffe8b16a008 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6 RAX: 0000000000000000 RBX: 00007ff4a4b31c05 RCX: 00007ff4a4aaff17 RDX: 0000000000000000 RSI: 0000000000000009 RDI: 00007ffe8b16a0c0 RBP: 00007ffe8b16a0c0 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000ffffffff R11: 0000000000000246 R12: 00007ffe8b16b150 R13: 00007ff4a4b31c05 R14: 00000000000257d4 R15: 00007ffe8b16b190 </TASK> 1 lock held by syz-executor/5962: #0: ffff88803976c0d0 (&type->s_umount_key#72){++++}-{4:4}, at: __super_lock fs/super.c:57 [inline] #0: ffff88803976c0d0 (&type->s_umount_key#72){++++}-{4:4}, at: __super_lock_excl fs/super.c:72 [inline] #0: ffff88803976c0d0 (&type->s_umount_key#72){++++}-{4:4}, at: deactivate_super+0xa9/0xe0 fs/super.c:506 INFO: task syz.4.168:6254 blocked for more than 143 seconds. Not tainted syzkaller #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz.4.168 state:D stack:25800 pid:6254 tgid:6254 ppid:5967 task_flags:0x400140 flags:0x00004004 Call Trace: <TASK> context_switch kernel/sched/core.c:5357 [inline] __schedule+0x16f3/0x4c20 kernel/sched/core.c:6961 __schedule_loop kernel/sched/core.c:7043 [inline] schedule+0x165/0x360 kernel/sched/core.c:7058 io_schedule+0x81/0xe0 kernel/sched/core.c:7903 folio_wait_bit_common+0x6b5/0xb90 mm/filemap.c:1317 folio_lock include/linux/pagemap.h:1133 [inline] __find_get_block_slow fs/buffer.c:205 [inline] find_get_block_common+0x2e6/0xfc0 fs/buffer.c:1408 bdev_getblk+0x4b/0x660 fs/buffer.c:-1 __bread_gfp+0x89/0x3c0 fs/buffer.c:1515 sb_bread include/linux/buffer_head.h:346 [inline] hfs_mdb_commit+0xa42/0x1160 fs/hfs/mdb.c:318 hfs_sync_fs+0x15/0x20 fs/hfs/super.c:37 __iterate_supers+0x13a/0x290 fs/super.c:924 ksys_sync+0xa3/0x150 fs/sync.c:103 __ia32_sys_sync+0xe/0x20 fs/sync.c:113 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f35c0abebe9 RSP: 002b:00007fff821c57b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a2 RAX: ffffffffffffffda RBX: 00007f35c0cf5fa0 RCX: 00007f35c0abebe9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007f35c0cf5fa0 R14: 00007f35c0cf5fa0 R15: 0000000000000000 </TASK> 1 lock held by syz.4.168/6254: #0: ffff88803976c0d0 (&type->s_umount_key#72){++++}-{4:4}, at: __super_lock fs/super.c:59 [inline] #0: ffff88803976c0d0 (&type->s_umount_key#72){++++}-{4:4}, at: super_lock+0x2a9/0x3b0 fs/super.c:121