On Apr 25, 2025 / 09:37, Ming Lei wrote: > ublk_cancel_cmd() calls io_uring_cmd_done() to complete uring_cmd, but > we may have scheduled task work via io_uring_cmd_complete_in_task() for > dispatching request, then kernel crash can be triggered. > > Fix it by not trying to canceling the command if ublk block request is > started. I found that the blktests test case ublk/002 often hangs with the recent v6.15-rcX kernel tags with the INFO at iou-wrk-X [1]. The hang is recreated in stable manner when I repeat the test case a few times. I bisected and this patch as the commit f40139fde527 triggers the hang. When I reverted the commit from the kernel v6.15-rc6, the hang disappeared. (I repeated ublk/002 20 times, and observed no hang.) The hang was observed with test systems with Fedora 42. I do not observe the hang with Fedora 41, but not sure why. liburing version difference could be the reason (v2.6 for Fedora 41, v2.9 for Fedora 42). Actions for fix will be appreciated. [1] [ 4497.777695] [ T130863] run blktests ublk/002 at 2025-05-07 14:48:32 [ 4499.983130] [ T67084] blk_print_req_error: 58 callbacks suppressed [ 4499.983136] [ T67084] I/O error, dev ublkb0, sector 106830432 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0 [ 4499.985230] [ T67084] I/O error, dev ublkb0, sector 165364248 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0 [ 4499.998194] [ T57173] I/O error, dev ublkb0, sector 251279584 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0 [ 4671.208372] [ T46] INFO: task iou-wrk-131061:131062 blocked for more than 122 seconds. [ 4671.209417] [ T46] Not tainted 6.15.0-rc5 #317 [ 4671.210182] [ T46] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4671.211194] [ T46] task:iou-wrk-131061 state:D stack:0 pid:131062 tgid:131061 ppid:131060 task_flags:0x404150 flags:0x00004000 [ 4671.212573] [ T46] Call Trace: [ 4671.213364] [ T46] <TASK> [ 4671.214047] [ T46] __schedule+0xf8c/0x5c80 [ 4671.214920] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.215831] [ T46] ? __mutex_lock+0x2d9/0x1b50 [ 4671.216760] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.217718] [ T46] ? lock_release+0x244/0x300 [ 4671.218621] [ T46] ? lock_release+0x244/0x300 [ 4671.219521] [ T46] ? __pfx___schedule+0x10/0x10 [ 4671.220433] [ T46] ? do_raw_spin_lock+0x124/0x260 [ 4671.221425] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.222292] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.223121] [ T46] ? lock_release+0x244/0x300 [ 4671.223984] [ T46] ? lock_release+0x244/0x300 [ 4671.224844] [ T46] schedule+0xda/0x390 [ 4671.225667] [ T46] blk_mq_freeze_queue_wait+0x121/0x170 [ 4671.226653] [ T46] ? __pfx_blk_mq_freeze_queue_wait+0x10/0x10 [ 4671.227656] [ T46] ? __pfx_autoremove_wake_function+0x10/0x10 [ 4671.228656] [ T46] ? trace_hardirqs_on+0x14/0x150 [ 4671.229564] [ T46] ? _raw_spin_unlock_irq+0x34/0x50 [ 4671.230484] [ T46] del_gendisk+0x4f9/0xa20 [ 4671.231320] [ T46] ? ublk_stop_dev+0x24/0x200 [ublk_drv] [ 4671.232379] [ T46] ? __pfx_del_gendisk+0x10/0x10 [ 4671.233197] [ T46] ? __pfx___mutex_lock+0x10/0x10 [ 4671.234094] [ T46] ublk_stop_dev_unlocked.part.0+0xb6/0x580 [ublk_drv] [ 4671.235153] [ T46] ? __pfx_ublk_stop_dev_unlocked.part.0+0x10/0x10 [ublk_drv] [ 4671.236303] [ T46] ? kvm_sched_clock_read+0xd/0x20 [ 4671.237156] [ T46] ublk_stop_dev+0x62/0x200 [ublk_drv] [ 4671.238081] [ T46] ? security_capable+0x79/0x110 [ 4671.238964] [ T46] ublk_ctrl_uring_cmd+0xe87/0x2a70 [ublk_drv] [ 4671.239967] [ T46] ? __pfx_avc_has_perm+0x10/0x10 [ 4671.240858] [ T46] ? lock_release+0x244/0x300 [ 4671.241759] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.242677] [ T46] ? __pfx_ublk_ctrl_uring_cmd+0x10/0x10 [ublk_drv] [ 4671.243751] [ T46] ? finish_task_switch.isra.0+0x202/0x850 [ 4671.244728] [ T46] ? selinux_uring_cmd+0x1c3/0x280 [ 4671.245623] [ T46] ? __schedule+0x2f21/0x5c80 [ 4671.246504] [ T46] ? __pfx_selinux_uring_cmd+0x10/0x10 [ 4671.247438] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.248228] [ T46] ? __pfx_do_raw_spin_lock+0x10/0x10 [ 4671.249109] [ T46] io_uring_cmd+0x221/0x3c0 [ 4671.249952] [ T46] io_issue_sqe+0x6ab/0x1800 [ 4671.250767] [ T46] ? trace_hardirqs_on+0x14/0x150 [ 4671.251634] [ T46] ? __pfx_io_issue_sqe+0x10/0x10 [ 4671.252502] [ T46] ? __timer_delete_sync+0x16c/0x270 [ 4671.253392] [ T46] ? __pfx___timer_delete_sync+0x10/0x10 [ 4671.254402] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.255162] [ T46] io_wq_submit_work+0x3ea/0xf70 [ 4671.255990] [ T46] io_worker_handle_work+0x615/0x1280 [ 4671.256854] [ T46] io_wq_worker+0x2ee/0xbd0 [ 4671.257670] [ T46] ? __pfx_io_wq_worker+0x10/0x10 [ 4671.258491] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.259285] [ T46] ? lock_acquire+0x2af/0x310 [ 4671.260035] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.260800] [ T46] ? do_raw_spin_lock+0x124/0x260 [ 4671.261604] [ T46] ? lock_acquire+0x2af/0x310 [ 4671.262376] [ T46] ? __pfx_do_raw_spin_lock+0x10/0x10 [ 4671.263186] [ T46] ? lock_release+0x244/0x300 [ 4671.263947] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4671.264718] [ T46] ? __pfx_io_wq_worker+0x10/0x10 [ 4671.265529] [ T46] ret_from_fork+0x30/0x70 [ 4671.266277] [ T46] ? __pfx_io_wq_worker+0x10/0x10 [ 4671.267070] [ T46] ret_from_fork_asm+0x1a/0x30 [ 4671.267855] [ T46] </TASK> [ 4671.268495] [ T46] INFO: lockdep is turned off. [ 4794.088765] [ T46] INFO: task iou-wrk-131061:131062 blocked for more than 245 seconds. [ 4794.090310] [ T46] Not tainted 6.15.0-rc5 #317 [ 4794.091453] [ T46] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 4794.093379] [ T46] task:iou-wrk-131061 state:D stack:0 pid:131062 tgid:131061 ppid:131060 task_flags:0x404150 flags:0x00004000 [ 4794.095353] [ T46] Call Trace: [ 4794.096142] [ T46] <TASK> [ 4794.096929] [ T46] __schedule+0xf8c/0x5c80 [ 4794.097878] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.098869] [ T46] ? __mutex_lock+0x2d9/0x1b50 [ 4794.099866] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.100870] [ T46] ? lock_release+0x244/0x300 [ 4794.101857] [ T46] ? lock_release+0x244/0x300 [ 4794.102797] [ T46] ? __pfx___schedule+0x10/0x10 [ 4794.103726] [ T46] ? do_raw_spin_lock+0x124/0x260 [ 4794.104752] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.105744] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.106678] [ T46] ? lock_release+0x244/0x300 [ 4794.107575] [ T46] ? lock_release+0x244/0x300 [ 4794.108528] [ T46] schedule+0xda/0x390 [ 4794.109391] [ T46] blk_mq_freeze_queue_wait+0x121/0x170 [ 4794.110437] [ T46] ? __pfx_blk_mq_freeze_queue_wait+0x10/0x10 [ 4794.111527] [ T46] ? __pfx_autoremove_wake_function+0x10/0x10 [ 4794.112662] [ T46] ? trace_hardirqs_on+0x14/0x150 [ 4794.113626] [ T46] ? _raw_spin_unlock_irq+0x34/0x50 [ 4794.114644] [ T46] del_gendisk+0x4f9/0xa20 [ 4794.115518] [ T46] ? ublk_stop_dev+0x24/0x200 [ublk_drv] [ 4794.116548] [ T46] ? __pfx_del_gendisk+0x10/0x10 [ 4794.117536] [ T46] ? __pfx___mutex_lock+0x10/0x10 [ 4794.118521] [ T46] ublk_stop_dev_unlocked.part.0+0xb6/0x580 [ublk_drv] [ 4794.119749] [ T46] ? __pfx_ublk_stop_dev_unlocked.part.0+0x10/0x10 [ublk_drv] [ 4794.121060] [ T46] ? kvm_sched_clock_read+0xd/0x20 [ 4794.122081] [ T46] ublk_stop_dev+0x62/0x200 [ublk_drv] [ 4794.123102] [ T46] ? security_capable+0x79/0x110 [ 4794.124139] [ T46] ublk_ctrl_uring_cmd+0xe87/0x2a70 [ublk_drv] [ 4794.125225] [ T46] ? __pfx_avc_has_perm+0x10/0x10 [ 4794.126054] [ T46] ? lock_release+0x244/0x300 [ 4794.126850] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.127579] [ T46] ? __pfx_ublk_ctrl_uring_cmd+0x10/0x10 [ublk_drv] [ 4794.128512] [ T46] ? finish_task_switch.isra.0+0x202/0x850 [ 4794.129379] [ T46] ? selinux_uring_cmd+0x1c3/0x280 [ 4794.130178] [ T46] ? __schedule+0x2f21/0x5c80 [ 4794.130926] [ T46] ? __pfx_selinux_uring_cmd+0x10/0x10 [ 4794.131701] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.132357] [ T46] ? __pfx_do_raw_spin_lock+0x10/0x10 [ 4794.133159] [ T46] io_uring_cmd+0x221/0x3c0 [ 4794.133931] [ T46] io_issue_sqe+0x6ab/0x1800 [ 4794.134653] [ T46] ? trace_hardirqs_on+0x14/0x150 [ 4794.135417] [ T46] ? __pfx_io_issue_sqe+0x10/0x10 [ 4794.136191] [ T46] ? __timer_delete_sync+0x16c/0x270 [ 4794.136984] [ T46] ? __pfx___timer_delete_sync+0x10/0x10 [ 4794.137833] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.138527] [ T46] io_wq_submit_work+0x3ea/0xf70 [ 4794.139268] [ T46] io_worker_handle_work+0x615/0x1280 [ 4794.140062] [ T46] io_wq_worker+0x2ee/0xbd0 [ 4794.140806] [ T46] ? __pfx_io_wq_worker+0x10/0x10 [ 4794.141478] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.142189] [ T46] ? lock_acquire+0x2af/0x310 [ 4794.142930] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.143663] [ T46] ? do_raw_spin_lock+0x124/0x260 [ 4794.144363] [ T46] ? lock_acquire+0x2af/0x310 [ 4794.145087] [ T46] ? __pfx_do_raw_spin_lock+0x10/0x10 [ 4794.145897] [ T46] ? lock_release+0x244/0x300 [ 4794.146617] [ T46] ? rcu_is_watching+0x11/0xb0 [ 4794.147340] [ T46] ? __pfx_io_wq_worker+0x10/0x10 [ 4794.148070] [ T46] ret_from_fork+0x30/0x70 [ 4794.148817] [ T46] ? __pfx_io_wq_worker+0x10/0x10 [ 4794.149576] [ T46] ret_from_fork_asm+0x1a/0x30 [ 4794.150307] [ T46] </TASK> [ 4794.150959] [ T46] INFO: lockdep is turned off. ...