On 2025/09/08 17:07, Eric Dumazet wrote:
> On Mon, Sep 8, 2025 at 1:52 AM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>> On 2025/09/06 17:16, Eric Dumazet wrote:
>>> On Fri, Sep 5, 2025 at 1:03 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
>>>> On Fri, Sep 5, 2025 at 1:00 PM syzbot
>>>> <syzbot+e1cd6bd8493060bd701d@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> Note to NBD maintainers: I held back about 20 syzbot reports, all pointing
>>> to NBD accepting various sockets. I can release them if needed, or if you
>>> prefer to triage them yourselves.
>>>
>> I'm not an NBD maintainer, just trying to understand the deadlock first.
>>
>> Is this deadlock only possible for some specific socket types? Take
>> a look at the report here:
>>
>> Usually issuing IO requires the order:
>>
>> q_usage_counter -> cmd lock -> tx lock -> sk lock
>>
>
> I have not seen the deadlock being reported with normal TCP sockets.
>
> NBD sets sk->sk_allocation to GFP_NOIO | __GFP_MEMALLOC
> from __sock_xmit(), and TCP seems to respect this.
>

That only holds if commit ffa1e7ada45 is missed. Given the __correct__ locking
order enforced by ffa1e7ada45 ("block: Make request_queue lockdep splats show
up earlier"), GFP_NOIO cannot cure any case that reverses that order, and
__GFP_MEMALLOC looks like a paper-over, not least because __GFP_MEMALLOC has
no bearing on lock_sock():

-> #0 (sk_lock-AF_INET6){+.+.}-{0:0}:
       check_prev_add kernel/locking/lockdep.c:3165 [inline]
       check_prevs_add kernel/locking/lockdep.c:3284 [inline]
       validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3908
       __lock_acquire+0xab9/0xd20 kernel/locking/lockdep.c:5237
       lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5868
       lock_sock_nested+0x48/0x100 net/core/sock.c:3733
       lock_sock include/net/sock.h:1667 [inline]
       inet_shutdown+0x6a/0x390 net/ipv4/af_inet.c:905
       nbd_mark_nsock_dead+0x2e9/0x560 drivers/block/nbd.c:318
       nbd_send_cmd+0x11ec/0x1ba0 drivers/block/nbd.c:799
       nbd_handle_cmd drivers/block/nbd.c:1174 [inline]
       nbd_queue_rq+0xcdb/0xf10 drivers/block/nbd.c:1204
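
For illustration, a minimal sketch of the pattern (made-up names like my_xmit()
and my_mark_sock_dead(), not the actual drivers/block/nbd.c code): sk_allocation
only steers what the allocator may do on behalf of the socket, it does not
change which locks are taken, so shutting the socket down from inside
->queue_rq() still records the q_usage_counter -> sk_lock dependency no matter
which GFP flags are set:

#include <linux/net.h>
#include <linux/sched/mm.h>
#include <net/sock.h>

/* Illustrative only -- not the actual nbd code. */
static int my_xmit(struct socket *sock, struct msghdr *msg)
{
	unsigned int noreclaim_flag;
	int ret;

	/*
	 * GFP_NOIO keeps socket allocations from recursing into block I/O,
	 * and __GFP_MEMALLOC lets them dip into reserves.  Both only affect
	 * memory allocations made on behalf of this socket, nothing else.
	 */
	noreclaim_flag = memalloc_noreclaim_save();
	sock->sk->sk_allocation = GFP_NOIO | __GFP_MEMALLOC;

	ret = sock_sendmsg(sock, msg);	/* typically takes sk_lock internally */

	memalloc_noreclaim_restore(noreclaim_flag);
	return ret;
}

/* Illustrative only: imagine this called from ->queue_rq(), i.e. with
 * q_usage_counter held, like the mark-dead path in the trace above. */
static void my_mark_sock_dead(struct socket *sock)
{
	/*
	 * kernel_sock_shutdown() -> inet_shutdown() -> lock_sock():
	 * the q_usage_counter -> sk_lock dependency is created here
	 * regardless of sk_allocation, which is what lockdep reports.
	 */
	kernel_sock_shutdown(sock, SHUT_RDWR);
}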