Begin forwarded message:

Date: Sun, 07 Sep 2025 03:42:22 +0000
From: bugzilla-daemon@xxxxxxxxxx
To: stephen@xxxxxxxxxxxxxxxxxx
Subject: [Bug 220544] New: AF_SMC deadlock: held by __sock_release, smc_release, and __flush_work

https://bugzilla.kernel.org/show_bug.cgi?id=220544

            Bug ID: 220544
           Summary: AF_SMC deadlock: held by __sock_release, smc_release,
                    and __flush_work
           Product: Networking
           Version: 2.5
    Kernel Version: 6.12.x
          Hardware: All
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P3
         Component: Other
          Assignee: stephen@xxxxxxxxxxxxxxxxxx
          Reporter: hi@xxxxxxxxxxx
        Regression: No

Created attachment 308627
  --> https://bugzilla.kernel.org/attachment.cgi?id=308627&action=edit
crash full log and program source code

[ 2499.781797]
[ 2499.782400] ======================================================
[ 2499.784129] WARNING: possible circular locking dependency detected
[ 2499.785824] 6.12.42 #1 Not tainted
[ 2499.786843] ------------------------------------------------------
[ 2499.788589] 1296/22742 is trying to acquire lock:
[ 2499.789941] ffff88801776ec18 ((work_completion)(&new_smc->smc_listen_work)){+.+.}-{0:0}, at: __flush_work+0x514/0xd50
[ 2499.793080]
[ 2499.793080] but task is already holding lock:
[ 2499.794731] ffff888017768e98 (sk_lock-AF_SMC/1){+.+.}-{0:0}, at: smc_release+0x376/0x600
[ 2499.797004]
[ 2499.797004] which lock already depends on the new lock.
[ 2499.797004]
[ 2499.799295]
[ 2499.799295] the existing dependency chain (in reverse order) is:
[ 2499.801365]
[ 2499.801365] -> #1 (sk_lock-AF_SMC/1){+.+.}-{0:0}:
[ 2499.803149]        lock_sock_nested+0x3a/0x100
[ 2499.804427]        smc_listen_out+0x1ea/0x4c0
[ 2499.805686]        smc_listen_work+0x4d1/0x5520
[ 2499.806987]        process_one_work+0x94a/0x1740
[ 2499.808415]        worker_thread+0x5c4/0xe10
[ 2499.809650]        kthread+0x2ad/0x360
[ 2499.810763]        ret_from_fork+0x4e/0x80
[ 2499.811966]        ret_from_fork_asm+0x1a/0x30
[ 2499.813324]
[ 2499.813324] -> #0 ((work_completion)(&new_smc->smc_listen_work)){+.+.}-{0:0}:
[ 2499.815691]        __lock_acquire+0x2413/0x4310
[ 2499.816983]        lock_acquire.part.0+0xff/0x350
[ 2499.818259]        __flush_work+0x528/0xd50
[ 2499.819376]        __cancel_work_sync+0x105/0x130
[ 2499.820689]        smc_clcsock_release+0x61/0xf0
[ 2499.821958]        __smc_release+0x5c9/0x8a0
[ 2499.823163]        smc_close_non_accepted+0xd7/0x210
[ 2499.824602]        smc_close_active+0x535/0x10e0
[ 2499.825867]        __smc_release+0x643/0x8a0
[ 2499.827067]        smc_release+0x1f0/0x600
[ 2499.828197]        __sock_release+0xac/0x260
[ 2499.829427]        sock_close+0x1c/0x30
[ 2499.830506]        __fput+0x3f6/0xb40
[ 2499.831552]        __fput_sync+0x4a/0x60
[ 2499.832651]        __x64_sys_close+0x86/0x100
[ 2499.833855]        do_syscall_64+0xbb/0x1d0
[ 2499.835043]        entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2499.836574]
[ 2499.836574] other info that might help us debug this:
[ 2499.836574]
[ 2499.838725]  Possible unsafe locking scenario:
[ 2499.838725]
[ 2499.840491]        CPU0                    CPU1
[ 2499.841748]        ----                    ----
[ 2499.843036]   lock(sk_lock-AF_SMC/1);
[ 2499.844110]                                lock((work_completion)(&new_smc->smc_listen_work));
[ 2499.846436]                                lock(sk_lock-AF_SMC/1);
[ 2499.848134]   lock((work_completion)(&new_smc->smc_listen_work));
[ 2499.849780]
[ 2499.849780]  *** DEADLOCK ***
[ 2499.849780]
[ 2499.851388] 3 locks held by 1296/22742:
[ 2499.852456]  #0: ffff88801ed58d88 (&sb->s_type->i_mutex_key#12){+.+.}-{3:3}, at: __sock_release+0x81/0x260
[ 2499.855185]  #1: ffff888017768e98 (sk_lock-AF_SMC/1){+.+.}-{0:0}, at: smc_release+0x376/0x600
[ 2499.857486]  #2: ffffffff86e9dc00 (rcu_read_lock){....}-{1:2}, at: __flush_work+0xff/0xd50
[ 2499.859710]
[ 2499.859710] stack backtrace:
[ 2499.860913] CPU: 0 UID: 0 PID: 22742 Comm: 1296 Not tainted 6.12.42 #1
[ 2499.860935] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 2499.860944] Call Trace:
[ 2499.860951]  <TASK>
[ 2499.860959]  dump_stack_lvl+0xba/0x110
[ 2499.860983]  print_circular_bug.cold+0x1e8/0x27f
[ 2499.861041]  check_noncircular+0x30e/0x3c0
[ 2499.861064]  ? __pfx_check_noncircular+0x10/0x10
[ 2499.861084]  ? register_lock_class+0xb2/0x12e0
[ 2499.861112]  ? lockdep_lock+0xb5/0x1b0
[ 2499.861131]  ? __pfx_lockdep_lock+0x10/0x10
[ 2499.861151]  __lock_acquire+0x2413/0x4310
[ 2499.861177]  ? __pfx___lock_acquire+0x10/0x10
[ 2499.861199]  ? __pfx_mark_lock+0x10/0x10
[ 2499.861221]  ? __flush_work+0x514/0xd50
[ 2499.861240]  lock_acquire.part.0+0xff/0x350
[ 2499.861261]  ? __flush_work+0x514/0xd50
[ 2499.861280]  ? lock_release+0x209/0x7d0
[ 2499.861302]  ? __pfx_lock_acquire.part.0+0x10/0x10
[ 2499.861323]  ? __flush_work+0x514/0xd50
[ 2499.861342]  ? trace_lock_acquire+0x132/0x1c0
[ 2499.861360]  ? __flush_work+0x514/0xd50
[ 2499.861378]  ? lock_acquire+0x31/0xc0
[ 2499.861398]  ? __flush_work+0x514/0xd50
[ 2499.861418]  __flush_work+0x528/0xd50
[ 2499.861436]  ? __flush_work+0x514/0xd50
[ 2499.861456]  ? __pfx___flush_work+0x10/0x10
[ 2499.861475]  ? __pfx_sock_def_readable+0x10/0x10
[ 2499.861497]  ? trace_irq_disable.constprop.0+0xcd/0x110
[ 2499.861519]  ? __pfx_wq_barrier_func+0x10/0x10
[ 2499.861548]  ? __pfx___might_resched+0x10/0x10
[ 2499.861567]  ? __pfx_sock_def_readable+0x10/0x10
[ 2499.861587]  __cancel_work_sync+0x105/0x130
[ 2499.861609]  smc_clcsock_release+0x61/0xf0
[ 2499.861630]  ? __local_bh_enable_ip+0x9b/0x140
[ 2499.861646]  __smc_release+0x5c9/0x8a0
[ 2499.861665]  ? lockdep_hardirqs_on_prepare+0x201/0x400
[ 2499.861688]  ? __pfx_sock_def_readable+0x10/0x10
[ 2499.861708]  smc_close_non_accepted+0xd7/0x210
[ 2499.861730]  smc_close_active+0x535/0x10e0
[ 2499.861753]  __smc_release+0x643/0x8a0
[ 2499.861772]  ? lockdep_hardirqs_on_prepare+0x25c/0x400
[ 2499.861795]  smc_release+0x1f0/0x600
[ 2499.861814]  __sock_release+0xac/0x260
[ 2499.861840]  ? __pfx_sock_close+0x10/0x10
[ 2499.861864]  sock_close+0x1c/0x30
[ 2499.861886]  __fput+0x3f6/0xb40
[ 2499.861912]  __fput_sync+0x4a/0x60
[ 2499.861935]  __x64_sys_close+0x86/0x100
[ 2499.861950]  do_syscall_64+0xbb/0x1d0
[ 2499.861972]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 2499.861992] RIP: 0033:0x7f40854559a0
[ 2499.862033] Code: 0d 00 00 00 eb b2 e8 0f f8 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 80 3d 41 1c 0e 00 00 74 17 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c
[ 2499.862049] RSP: 002b:00007ffecd7bbad8 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
[ 2499.862065] RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00007f40854559a0
[ 2499.862077] RDX: 0000000000000000 RSI: 000055df79c1fe38 RDI: 0000000000000005
[ 2499.862087] RBP: 0000000000000006 R08: 000000000000f800 R09: 0000000000000073
[ 2499.862098] R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffecd7bbb80
[ 2499.862110] R13: 00007ffecd7bbdb8 R14: 000055df79c21dd8 R15: 0000000000000000
[ 2499.862128]  </TASK>

Crashes happened on 6.12.34 and 6.12.42.

Machine info: QEMU X86_64
Linux version 6.12.42(gcc (GCC) 15.1.1 20250729, GNU ld (GNU Binutils) 2.45.0) #1 SMP PREEMPT_DYNAMIC Tue Aug 19 21:04:29 EDT 2025
Command line: console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0 nokaslr
infiniband enabled through rxe

Programs and logs that trigger the bug are attached
Usage `cat crash.input | program`

--
You may reply to this email to add a comment.

You are receiving this mail because:
You are the assignee for the bug.