Hi all, I ran the latest blktests (git hash: 9a9a365f389f) with the v6.15-rc1 kernel. I observed 4 failures listed below. Comparing with the previous report with the v6.14-rc1 kernel [1], 3 failures are no longer observed (block/002, nvme/058 and zbd/009), and 2 new failures are observed (rxe driver hang, q_usage_counter WARN during system boot). [1] https://lore.kernel.org/linux-block/uyijd3ufbrfbiyyaajvhyhdyytssubekvymzgyiqjqmkh33cmi@ksqjpewsqlvw/ List of failures ================ #1: rxe driver test hangs (new: nvme rdma transport, rnbd, srp) #2: nvme/037 (fc transport) #3: nvme/041 (fc transport) #4: q_usage_counter WARN during system boot (new) Failure description =================== #1: rxe driver test hangs (new: nvme rdma transport, rnbd, srp) When use_rxe=1 is set in blktests config, blktests runs trigger null pointer dereferences and hang at test cases which use rdma transport. The hang is observed at nvme, rnbd and srp test groups. Li Zhijian provided the fix patch (thanks!) [2], and it is already applied for v6.15-rcX releases. [2] https://lore.kernel.org/linux-rdma/20250402032657.1762800-1-lizhijian@xxxxxxxxxxx/ #2: nvme/037 (fc transport) #3: nvme/041 (fc transport) These two test cases fail for fc transport. Refer to the report for v6.12 kernel [3]. [3] https://lore.kernel.org/linux-nvme/6crydkodszx5vq4ieox3jjpwkxtu7mhbohypy24awlo5w7f4k6@to3dcng24rd4/ #4: q_usage_counter WARN during system boot (new) To be precise, this is not a blktests failure, but I found it on my test systems for blktests. During the system boot process, a lockdep WARN was reported which is relevant to q_usage_counter [4]. I think this needs further debug. [4] Apr 10 10:24:19 testnode2 kernel: ====================================================== Apr 10 10:24:19 testnode2 kernel: WARNING: possible circular locking dependency detected Apr 10 10:24:19 testnode2 kernel: 6.15.0-rc1+ #22 Not tainted Apr 10 10:24:19 testnode2 kernel: ------------------------------------------------------ Apr 10 10:24:19 testnode2 kernel: (udev-worker)/556 is trying to acquire lock: Apr 10 10:24:19 testnode2 kernel: ffff8881241bb698 (&q->elevator_lock){+.+.}-{4:4}, at: elv_iosched_store+0x178/0x4d0 Apr 10 10:24:19 testnode2 kernel: but task is already holding lock: Apr 10 10:24:19 testnode2 kernel: ffff8881241bb168 (&q->q_usage_counter(io)#12){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0xe/0x20 Apr 10 10:24:19 testnode2 kernel: which lock already depends on the new lock. Apr 10 10:24:19 testnode2 kernel: the existing dependency chain (in reverse order) is: Apr 10 10:24:19 testnode2 kernel: -> #2 (&q->q_usage_counter(io)#12){++++}-{0:0}: Apr 10 10:24:19 testnode2 kernel: blk_alloc_queue+0x5bc/0x700 Apr 10 10:24:19 testnode2 kernel: blk_mq_alloc_queue+0x14e/0x230 Apr 10 10:24:19 testnode2 kernel: scsi_alloc_sdev+0x848/0xc70 Apr 10 10:24:19 testnode2 kernel: scsi_probe_and_add_lun+0x475/0xb60 Apr 10 10:24:19 testnode2 kernel: __scsi_scan_target+0x18d/0x3b0 Apr 10 10:24:19 testnode2 kernel: scsi_scan_channel+0xfa/0x1a0 Apr 10 10:24:19 testnode2 kernel: scsi_scan_host_selected+0x1f8/0x2b0 Apr 10 10:24:19 testnode2 kernel: scsi_scan_host+0x2e2/0x3a0 Apr 10 10:24:19 testnode2 kernel: sym2_probe.cold+0xd7f/0xddc [sym53c8xx] Apr 10 10:24:19 testnode2 kernel: local_pci_probe+0xdb/0x180 Apr 10 10:24:19 testnode2 kernel: pci_device_probe+0x588/0x7d0 Apr 10 10:24:19 testnode2 kernel: really_probe+0x1e3/0x8a0 Apr 10 10:24:19 testnode2 kernel: __driver_probe_device+0x18c/0x370 Apr 10 10:24:19 testnode2 kernel: driver_probe_device+0x4a/0x120 Apr 10 10:24:19 testnode2 kernel: __driver_attach+0x190/0x4a0 Apr 10 10:24:19 testnode2 kernel: bus_for_each_dev+0x104/0x180 Apr 10 10:24:19 testnode2 kernel: bus_add_driver+0x29d/0x4d0 Apr 10 10:24:19 testnode2 kernel: driver_register+0x1a1/0x350 Apr 10 10:24:19 testnode2 kernel: vsock_loopback_send_pkt+0xd4/0x120 [vsock_loopback] Apr 10 10:24:19 testnode2 kernel: do_one_initcall+0xd1/0x450 Apr 10 10:24:19 testnode2 kernel: do_init_module+0x238/0x740 Apr 10 10:24:19 testnode2 kernel: load_module+0x52a3/0x73b0 Apr 10 10:24:19 testnode2 kernel: init_module_from_file+0xe5/0x150 Apr 10 10:24:19 testnode2 kernel: idempotent_init_module+0x22d/0x770 Apr 10 10:24:19 testnode2 kernel: __x64_sys_finit_module+0xbe/0x140 Apr 10 10:24:19 testnode2 kernel: do_syscall_64+0x93/0x190 Apr 10 10:24:19 testnode2 kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e Apr 10 10:24:19 testnode2 kernel: -> #1 (fs_reclaim){+.+.}-{0:0}: Apr 10 10:24:19 testnode2 kernel: fs_reclaim_acquire+0xc5/0x100 Apr 10 10:24:19 testnode2 kernel: kmem_cache_alloc_noprof+0x4d/0x430 Apr 10 10:24:19 testnode2 kernel: __kernfs_new_node+0xcb/0x750 Apr 10 10:24:19 testnode2 kernel: kernfs_new_node+0xeb/0x1b0 Apr 10 10:24:19 testnode2 kernel: kernfs_create_dir_ns+0x27/0x140 Apr 10 10:24:19 testnode2 kernel: sysfs_create_dir_ns+0x12c/0x270 Apr 10 10:24:19 testnode2 kernel: kobject_add_internal+0x272/0x800 Apr 10 10:24:19 testnode2 kernel: kobject_add+0x131/0x1a0 Apr 10 10:24:19 testnode2 kernel: elv_register_queue+0xb7/0x220 Apr 10 10:24:19 testnode2 kernel: blk_register_queue+0x31a/0x470 Apr 10 10:24:19 testnode2 kernel: add_disk_fwnode+0x712/0x10a0 Apr 10 10:24:19 testnode2 kernel: sd_probe+0x94e/0xf30 Apr 10 10:24:19 testnode2 kernel: really_probe+0x1e3/0x8a0 Apr 10 10:24:19 testnode2 kernel: __driver_probe_device+0x18c/0x370 Apr 10 10:24:19 testnode2 kernel: driver_probe_device+0x4a/0x120 Apr 10 10:24:19 testnode2 kernel: __device_attach_driver+0x15e/0x270 Apr 10 10:24:19 testnode2 kernel: bus_for_each_drv+0x113/0x1a0 Apr 10 10:24:19 testnode2 kernel: __device_attach_async_helper+0x19a/0x240 Apr 10 10:24:19 testnode2 kernel: zram0: detected capacity change from 0 to 16777216 Apr 10 10:24:19 testnode2 kernel: async_run_entry_fn+0x96/0x4f0 Apr 10 10:24:19 testnode2 kernel: process_one_work+0x84f/0x1460 Apr 10 10:24:19 testnode2 kernel: worker_thread+0x5ef/0xfd0 Apr 10 10:24:19 testnode2 kernel: kthread+0x3b0/0x770 Apr 10 10:24:19 testnode2 kernel: ret_from_fork+0x30/0x70 Apr 10 10:24:19 testnode2 kernel: ret_from_fork_asm+0x1a/0x30 Apr 10 10:24:19 testnode2 kernel: -> #0 (&q->elevator_lock){+.+.}-{4:4}: Apr 10 10:24:19 testnode2 kernel: __lock_acquire+0x1405/0x2210 Apr 10 10:24:19 testnode2 kernel: lock_acquire+0x170/0x310 Apr 10 10:24:19 testnode2 kernel: __mutex_lock+0x1a9/0x1a40 Apr 10 10:24:19 testnode2 kernel: elv_iosched_store+0x178/0x4d0 Apr 10 10:24:19 testnode2 kernel: queue_attr_store+0x236/0x2f0 Apr 10 10:24:19 testnode2 kernel: kernfs_fop_write_iter+0x39f/0x5a0 Apr 10 10:24:19 testnode2 kernel: vfs_write+0x5f7/0xe90 Apr 10 10:24:19 testnode2 kernel: ksys_write+0xf5/0x1c0 Apr 10 10:24:19 testnode2 kernel: do_syscall_64+0x93/0x190 Apr 10 10:24:19 testnode2 kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e Apr 10 10:24:19 testnode2 kernel: other info that might help us debug this: Apr 10 10:24:19 testnode2 kernel: Chain exists of: &q->elevator_lock --> fs_reclaim --> &q->q_usage_counter(io)#12 Apr 10 10:24:19 testnode2 kernel: Possible unsafe locking scenario: Apr 10 10:24:19 testnode2 kernel: CPU0 CPU1 Apr 10 10:24:19 testnode2 kernel: ---- ---- Apr 10 10:24:19 testnode2 kernel: lock(&q->q_usage_counter(io)#12); Apr 10 10:24:19 testnode2 kernel: lock(fs_reclaim); Apr 10 10:24:19 testnode2 kernel: lock(&q->q_usage_counter(io)#12); Apr 10 10:24:19 testnode2 kernel: lock(&q->elevator_lock); Apr 10 10:24:19 testnode2 kernel: *** DEADLOCK *** Apr 10 10:24:19 testnode2 kernel: 5 locks held by (udev-worker)/556: Apr 10 10:24:19 testnode2 kernel: #0: ffff888120de0420 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0xf5/0x1c0 Apr 10 10:24:19 testnode2 kernel: #1: ffff888130a62c88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x25c/0x5a0 Apr 10 10:24:19 testnode2 kernel: #2: ffff88812d38ec38 (kn->active#72){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x27f/0x5a0 Apr 10 10:24:19 testnode2 kernel: #3: ffff8881241bb168 (&q->q_usage_counter(io)#12){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0xe/0x20 Apr 10 10:24:19 testnode2 kernel: #4: ffff8881241bb1a0 (&q->q_usage_counter(queue)#5){++++}-{0:0}, at: blk_mq_freeze_queue_nomemsave+0xe/0x20 Apr 10 10:24:19 testnode2 kernel: stack backtrace: Apr 10 10:24:19 testnode2 kernel: CPU: 0 UID: 0 PID: 556 Comm: (udev-worker) Not tainted 6.15.0-rc1+ #22 PREEMPT(voluntary) Apr 10 10:24:19 testnode2 kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014 Apr 10 10:24:19 testnode2 kernel: Call Trace: Apr 10 10:24:19 testnode2 kernel: <TASK> Apr 10 10:24:19 testnode2 kernel: dump_stack_lvl+0x6a/0x90 Apr 10 10:24:19 testnode2 kernel: print_circular_bug.cold+0x178/0x1be Apr 10 10:24:19 testnode2 kernel: check_noncircular+0x146/0x160 Apr 10 10:24:19 testnode2 kernel: __lock_acquire+0x1405/0x2210 Apr 10 10:24:19 testnode2 kernel: lock_acquire+0x170/0x310 Apr 10 10:24:19 testnode2 kernel: ? elv_iosched_store+0x178/0x4d0 Apr 10 10:24:19 testnode2 kernel: ? __pfx___might_resched+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: __mutex_lock+0x1a9/0x1a40 Apr 10 10:24:19 testnode2 kernel: ? elv_iosched_store+0x178/0x4d0 Apr 10 10:24:19 testnode2 kernel: ? elv_iosched_store+0x178/0x4d0 Apr 10 10:24:19 testnode2 kernel: ? mark_held_locks+0x40/0x70 Apr 10 10:24:19 testnode2 kernel: ? _raw_spin_unlock_irqrestore+0x4c/0x60 Apr 10 10:24:19 testnode2 kernel: ? __pfx___mutex_lock+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? _raw_spin_unlock_irqrestore+0x35/0x60 Apr 10 10:24:19 testnode2 kernel: ? blk_mq_freeze_queue_wait+0x15a/0x170 Apr 10 10:24:19 testnode2 kernel: ? __pfx_autoremove_wake_function+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? elv_iosched_store+0x178/0x4d0 Apr 10 10:24:19 testnode2 kernel: elv_iosched_store+0x178/0x4d0 Apr 10 10:24:19 testnode2 kernel: ? lock_acquire+0x170/0x310 Apr 10 10:24:19 testnode2 kernel: ? __pfx_elv_iosched_store+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? __pfx___might_resched+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? __pfx_sysfs_kf_write+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: queue_attr_store+0x236/0x2f0 Apr 10 10:24:19 testnode2 kernel: ? __pfx_queue_attr_store+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? find_held_lock+0x2b/0x80 Apr 10 10:24:19 testnode2 kernel: ? sysfs_file_kobj+0xb3/0x1c0 Apr 10 10:24:19 testnode2 kernel: ? sysfs_file_kobj+0xb3/0x1c0 Apr 10 10:24:19 testnode2 kernel: ? lock_release+0x17d/0x2c0 Apr 10 10:24:19 testnode2 kernel: ? lock_is_held_type+0xd5/0x130 Apr 10 10:24:19 testnode2 kernel: ? __pfx_sysfs_kf_write+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? sysfs_file_kobj+0xbd/0x1c0 Apr 10 10:24:19 testnode2 kernel: ? __pfx_sysfs_kf_write+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: kernfs_fop_write_iter+0x39f/0x5a0 Apr 10 10:24:19 testnode2 kernel: vfs_write+0x5f7/0xe90 Apr 10 10:24:19 testnode2 kernel: ? seqcount_lockdep_reader_access.constprop.0+0x82/0x90 Apr 10 10:24:19 testnode2 kernel: ? __pfx_vfs_write+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? __lock_acquire+0x45d/0x2210 Apr 10 10:24:19 testnode2 kernel: ? find_held_lock+0x2b/0x80 Apr 10 10:24:19 testnode2 kernel: ? ktime_get_coarse_real_ts64+0x3d/0xd0 Apr 10 10:24:19 testnode2 kernel: ? ktime_get_coarse_real_ts64+0x3d/0xd0 Apr 10 10:24:19 testnode2 kernel: ksys_write+0xf5/0x1c0 Apr 10 10:24:19 testnode2 kernel: ? __pfx_ksys_write+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? ktime_get_coarse_real_ts64+0x3d/0xd0 Apr 10 10:24:19 testnode2 kernel: do_syscall_64+0x93/0x190 Apr 10 10:24:19 testnode2 kernel: ? __pfx___might_resched+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? find_held_lock+0x2b/0x80 Apr 10 10:24:19 testnode2 kernel: ? __might_fault+0x99/0x120 Apr 10 10:24:19 testnode2 kernel: ? __might_fault+0x99/0x120 Apr 10 10:24:19 testnode2 kernel: ? lock_release+0x17d/0x2c0 Apr 10 10:24:19 testnode2 kernel: ? rcu_is_watching+0x11/0xb0 Apr 10 10:24:19 testnode2 kernel: ? __rseq_handle_notify_resume+0x3f7/0xc00 Apr 10 10:24:19 testnode2 kernel: ? __pfx___rseq_handle_notify_resume+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? __pfx_blkcg_maybe_throttle_current+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? __pfx___x64_sys_openat+0x10/0x10 Apr 10 10:24:19 testnode2 kernel: ? syscall_exit_to_user_mode+0x8e/0x280 Apr 10 10:24:19 testnode2 kernel: ? rcu_is_watching+0x11/0xb0 Apr 10 10:24:19 testnode2 kernel: ? do_syscall_64+0x9f/0x190 Apr 10 10:24:19 testnode2 kernel: ? lockdep_hardirqs_on+0x78/0x100 Apr 10 10:24:19 testnode2 kernel: ? do_syscall_64+0x9f/0x190 Apr 10 10:24:19 testnode2 kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e Apr 10 10:24:19 testnode2 kernel: RIP: 0033:0x7f8784cf4f44 Apr 10 10:24:19 testnode2 kernel: Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 85 91 10 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 55 48 89 e5 48 83 ec 20 48 89 Apr 10 10:24:19 testnode2 kernel: RSP: 002b:00007fffc374d1a8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 Apr 10 10:24:19 testnode2 kernel: RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f8784cf4f44 Apr 10 10:24:19 testnode2 kernel: RDX: 0000000000000003 RSI: 00007fffc374d4b0 RDI: 000000000000001b Apr 10 10:24:19 testnode2 kernel: RBP: 00007fffc374d1d0 R08: 00007f8784df51c8 R09: 00007fffc374d280 Apr 10 10:24:19 testnode2 kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003 Apr 10 10:24:19 testnode2 kernel: R13: 00007fffc374d4b0 R14: 000055941522b580 R15: 00007f8784df4e80 Apr 10 10:24:19 testnode2 kernel: </TASK>