On Mon, Jun 23, 2025 at 2:13 AM Changhui Zhong <czhong@xxxxxxxxxx> wrote:
>
> On Mon, Jun 23, 2025 at 12:02 PM Ming Lei <ming.lei@xxxxxxxxxx> wrote:
> >
> > Hi Changhui,
> >
> > On Mon, Jun 23, 2025 at 10:58:24AM +0800, Changhui Zhong wrote:
> > > Hello,
> > >
> > > the following kernel panic was triggered by ubdsrv generic/002,
> > > please help check and let me know if you need any info/test, thanks.
> > >
> > > commit HEAD:
> > >
> > > commit 2589cd05008205ee29f5f66f24a684732ee2e3a3
> > > Merge: 98d0347fe8fb e1c75831f682
> > > Author: Jens Axboe <axboe@xxxxxxxxx>
> > > Date: Wed Jun 18 05:11:50 2025 -0600
> > >
> > > Merge branch 'io_uring-6.16' into for-next
> > >
> > > * io_uring-6.16:
> > > io_uring: fix potential page leak in io_sqe_buffer_register()
> > > io_uring/sqpoll: don't put task_struct on tctx setup failure
> > > io_uring: remove duplicate io_uring_alloc_task_context() definition
> >
> > The above branch has been merged to v6.16-rc3, can you reproduce it with -rc3?
> >
> > I tried to duplicate in my test VM, not succeed with -rc3.
> >
> > ...
> >
> > > [ 7044.064528] BUG: kernel NULL pointer dereference, address: 0000000000000001
> > > [ 7044.071507] #PF: supervisor read access in kernel mode
> > > [ 7044.076653] #PF: error_code(0x0000) - not-present page
> > > [ 7044.081801] PGD 462c42067 P4D 462c42067 PUD 462c43067 PMD 0
> > > [ 7044.087488] Oops: Oops: 0000 [#1] SMP NOPTI
> > > [ 7044.091685] CPU: 13 UID: 0 PID: 367 Comm: kworker/13:1H Not tainted
> > > 6.16.0-rc2+ #1 PREEMPT(voluntary)
> > > [ 7044.100991] Hardware name: Dell Inc. PowerEdge R640/0X45NX, BIOS
> > > 2.22.2 09/12/2024
> > > [ 7044.108565] Workqueue: kblockd blk_mq_requeue_work
> > > [ 7044.113374] RIP: 0010:__io_req_task_work_add+0x18/0x1f0
> >
> > Can you share where the above line points to source line if it can be
> > reproduced in -rc3?
> >
> > gdb> l *(__io_req_task_work_add+0x18)
> >
> >
> > Thanks,
> > Ming
> >
>
> now successfully reproduced on v6.16-rc3, more loop tests are needed
> to trigger this issue,
>
> [ 8898.102836] BUG: kernel NULL pointer dereference, address: 0000000000000001
> [ 8898.109848] #PF: supervisor read access in kernel mode
> [ 8898.115011] #PF: error_code(0x0000) - not-present page
> [ 8898.120161] PGD 80000001bcd7b067 P4D 80000001bcd7b067 PUD 1ee49f067 PMD 0
> [ 8898.127043] Oops: Oops: 0000 [#1] SMP PTI
> [ 8898.131065] CPU: 2 UID: 0 PID: 47056 Comm: kworker/2:2H Not tainted
> 6.16.0-rc3 #1 PREEMPT(voluntary)
> [ 8898.140283] Hardware name: Dell Inc. PowerEdge R340/045M96, BIOS
> 2.17.3 09/12/2024
> [ 8898.147860] Workqueue: kblockd blk_mq_requeue_work
> [ 8898.152658] RIP: 0010:__io_req_task_work_add+0x18/0x1f0
> [ 8898.157895] Code: 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> 90 90 66 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 55 53 48 8b 6f 60
> 48 89 fb <f6> 45 01 20 0f 84 8e 00 00 00 31 c0 f6 47 48 0c 0f 94 c0 21
> c6 41

Disassembling this:

   0:   41 56                   push   r14
   2:   41 55                   push   r13
   4:   41 54                   push   r12
   6:   55                      push   rbp
   7:   53                      push   rbx
   8:   48 8b 6f 60             mov    rbp,QWORD PTR [rdi+0x60]
   c:   48 89 fb                mov    rbx,rdi
   f:   f6 45 01 20             test   BYTE PTR [rbp+0x1],0x20   <--here
  13:   0f 84 8e 00 00 00       je     0xa7
  19:   31 c0                   xor    eax,eax
  1b:   f6 47 48 0c             test   BYTE PTR [rdi+0x48],0xc
  1f:   0f 94 c0                sete   al
  22:   21 c6                   and    esi,eax

So we look to be at the start of __io_req_task_work_add(). rdi stores
req, rbp stores req->ctx, and so the test instruction that's faulting
is loading (the second byte of) req->ctx->flags for the
req->ctx->flags & IORING_SETUP_DEFER_TASKRUN check. This means
req->ctx is NULL. Is it possible the req has already been completed or
cancelled? The stacktrace shows that this is coming from
blk_mq_requeue_work, which is definitely interesting.

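For what it's worth, the arithmetic behind that reading can be checked
with a small standalone sketch. This is user-space C, not kernel code.
The helper names are made up for illustration, and it assumes a
little-endian layout with flags at offset 0 of struct io_ring_ctx
(which is what the [rbp+0x1] displacement suggests).
IORING_SETUP_DEFER_TASKRUN is (1U << 13) in the uapi header, so the bit
lands in byte 1 with mask 0x20, and a NULL ctx gives a fault address of
0x1, which lines up with CR2 and RBP in the register dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value from include/uapi/linux/io_uring.h. */
#define IORING_SETUP_DEFER_TASKRUN (1U << 13)

/* Hypothetical helper: index of the lowest non-zero byte of a flag
 * mask, i.e. which byte of a little-endian u32 a byte-sized `test`
 * instruction has to load to check that flag. */
static unsigned flag_byte_offset(uint32_t mask)
{
	unsigned off = 0;

	while (off < 3 && (mask & 0xffu) == 0) {
		mask >>= 8;
		off++;
	}
	return off;
}

/* Hypothetical helper: the immediate used against that single byte. */
static uint8_t flag_byte_mask(uint32_t mask)
{
	return (uint8_t)(mask >> (8 * flag_byte_offset(mask)));
}

/* Hypothetical helper: address touched when testing the flag through a
 * ctx pointer. flags_offset is the assumed offset of the flags field
 * inside the context structure. */
static uintptr_t fault_address(uintptr_t ctx, size_t flags_offset,
			       uint32_t mask)
{
	return ctx + flags_offset + flag_byte_offset(mask);
}

/* Compare the computed values with the oops: `test BYTE PTR
 * [rbp+0x1],0x20` and CR2 = 0x1 with RBP = 0. */
static void check_against_oops(void)
{
	assert(flag_byte_offset(IORING_SETUP_DEFER_TASKRUN) == 1);
	assert(flag_byte_mask(IORING_SETUP_DEFER_TASKRUN) == 0x20);
	assert(fault_address(0, 0, IORING_SETUP_DEFER_TASKRUN) == 0x1);
}
```

Calling check_against_oops() from a test main() passes silently if the
values agree with the trace. A different flags offset or a non-NULL ctx
would give a different CR2.
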
Best,
Caleb

> [ 8898.176650] RSP: 0018:ffffd28e08d03c50 EFLAGS: 00010206
> [ 8898.181882] RAX: ffffffffc0dc73d0 RBX: ffff8d64218c35c0 RCX: ffff8d676ee1e828
> [ 8898.189025] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8d64218c35c0
> [ 8898.196165] RBP: 0000000000000000 R08: 0000000000010000 R09: ffff8d6402d42600
> [ 8898.203308] R10: ffff8d6400c1d8c0 R11: fefefefefefefeff R12: ffff8d64218c35c0
> [ 8898.210448] R13: ffffd28e08d03cc8 R14: 0000000000000000 R15: ffff8d6420901310
> [ 8898.217592] FS: 0000000000000000(0000) GS:ffff8d67cd7c5000(0000)
> knlGS:0000000000000000
> [ 8898.225685] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8898.231441] CR2: 0000000000000001 CR3: 00000001951b8003 CR4: 00000000003726f0
> [ 8898.238581] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 8898.245720] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 8898.252876] Call Trace:
> [ 8898.255335] <TASK>
> [ 8898.257450] ublk_queue_rq+0x50/0x90 [ublk_drv]
> [ 8898.261989] blk_mq_dispatch_rq_list+0x13c/0x510
> [ 8898.266620] __blk_mq_sched_dispatch_requests+0x118/0x1a0
> [ 8898.272027] ? xa_find_after+0xfc/0x190
> [ 8898.275876] blk_mq_sched_dispatch_requests+0x2d/0x70
> [ 8898.280937] blk_mq_run_hw_queue+0x26a/0x2e0
> [ 8898.285216] blk_mq_run_hw_queues+0x7f/0x140
> [ 8898.289498] blk_mq_requeue_work+0x19f/0x1e0
> [ 8898.293782] process_one_work+0x188/0x340
> [ 8898.297820] worker_thread+0x257/0x3a0
> [ 8898.301578] ? __pfx_worker_thread+0x10/0x10
> [ 8898.305871] kthread+0xf9/0x240
> [ 8898.309022] ? __pfx_kthread+0x10/0x10
> [ 8898.312785] ? __pfx_kthread+0x10/0x10
> [ 8898.316549] ret_from_fork+0xed/0x110
> [ 8898.320220] ? __pfx_kthread+0x10/0x10
> [ 8898.323981] ret_from_fork_asm+0x1a/0x30
> [ 8898.327919] </TASK>
> [ 8898.330118] Modules linked in: ublk_drv rpcsec_gss_krb5 auth_rpcgss
> nfsv4 dns_resolver nfs lockd grace nfs_localio netfs sunrpc ipmi_ssif
> intel_rapl_msr intel_rapl_common intel_uncore_frequency
> intel_uncore_frequency_common intel_pmc_core_pltdrv intel_pmc_core
> pmt_telemetry pmt_class intel_pmc_ssram_telemetry intel_vsec
> intel_tcc_cooling x86_pkg_temp_thermal intel_powerclamp coretemp
> kvm_intel kvm platform_profile dell_wmi dell_smbios iTCO_wdt irqbypass
> dell_wmi_descriptor iTCO_vendor_support rapl sparse_keymap rfkill
> intel_cstate mgag200 tg3 mei_me dcdbas intel_uncore i2c_algo_bit
> pcspkr mei i2c_i801 idma64 i2c_smbus ie31200_edac acpi_power_meter
> intel_pch_thermal ipmi_si acpi_ipmi ipmi_devintf ipmi_msghandler sg
> fuse loop dm_multipath nfnetlink xfs sd_mod ahci libahci megaraid_sas
> libata ghash_clmulni_intel video pinctrl_cannonlake wmi dm_mirror
> dm_region_hash dm_log dm_mod [last unloaded: ublk_drv]
> [ 8898.409843] CR2: 0000000000000001
> [ 8898.413172] ---[ end trace 0000000000000000 ]---
> [ 8898.510831] pstore: backend (erst) writing error (-19)
> [ 8898.515985] RIP: 0010:__io_req_task_work_add+0x18/0x1f0
> [ 8898.521221] Code: 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> 90 90 66 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 55 53 48 8b 6f 60
> 48 89 fb <f6> 45 01 20 0f 84 8e 00 00 00 31 c0 f6 47 48 0c 0f 94 c0 21
> c6 41
> [ 8898.539975] RSP: 0018:ffffd28e08d03c50 EFLAGS: 00010206
> [ 8898.545208] RAX: ffffffffc0dc73d0 RBX: ffff8d64218c35c0 RCX: ffff8d676ee1e828
> [ 8898.552348] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8d64218c35c0
> [ 8898.559492] RBP: 0000000000000000 R08: 0000000000010000 R09: ffff8d6402d42600
> [ 8898.566631] R10: ffff8d6400c1d8c0 R11: fefefefefefefeff R12: ffff8d64218c35c0
> [ 8898.573775] R13: ffffd28e08d03cc8 R14: 0000000000000000 R15: ffff8d6420901310
> [ 8898.580913] FS: 0000000000000000(0000) GS:ffff8d67cd7c5000(0000)
> knlGS:0000000000000000
> [ 8898.589011] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8898.594763] CR2: 0000000000000001 CR3: 00000001951b8003 CR4: 00000000003726f0
> [ 8898.601906] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 8898.609047] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 8898.616191] Kernel panic - not syncing: Fatal exception
> [ 8898.621466] Kernel Offset: 0x1dc00000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 8898.646077] ---[ end Kernel panic - not syncing: Fatal exception ]---
>
>
> (gdb) l *(__io_req_task_work_add+0x18)
> 0xffffffff81907668 is in __io_req_task_work_add (io_uring/io_uring.c:1251).
> 1246            io_fallback_tw(tctx, false);
> 1247    }
> 1248
> 1249    void __io_req_task_work_add(struct io_kiocb *req, unsigned flags)
> 1250    {
> 1251            if (req->ctx->flags & IORING_SETUP_DEFER_TASKRUN)
> 1252                    io_req_local_work_add(req, flags);
> 1253            else
> 1254                    io_req_normal_work_add(req);
> 1255    }
> (gdb)
>
>
> Thanks,
> Changhui
>
>