On Wed, Sep 10, 2025 at 7:05 AM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
>
> On 9/9/25 11:56 PM, Kuniyuki Iwashima wrote:
> > On Tue, Sep 9, 2025 at 10:15 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
> >>
> >> On 9/9/25 4:26 PM, Kuniyuki Iwashima wrote:
> >>> syzbot reported the splat below. [0]
> >>>
> >>> The repro does the following:
> >>>
> >>>   1. Load a sk_msg prog that calls bpf_msg_cork_bytes(msg, cork_bytes)
> >>>   2. Attach the prog to a SOCKMAP
> >>>   3. Add a socket to the SOCKMAP
> >>>   4. Activate fault injection
> >>>   5. Send data less than cork_bytes
> >>>
> >>> At 5., the data is carried over to the next sendmsg() as it is
> >>> smaller than the cork_bytes specified by bpf_msg_cork_bytes().
> >>>
> >>> Then, tcp_bpf_send_verdict() tries to allocate psock->cork to hold
> >>> the data, but this fails silently due to fault injection + __GFP_NOWARN.
> >>>
> >>> If the allocation fails, we need to revert the sk->sk_forward_alloc
> >>> change done by sk_msg_alloc().
> >>>
> >>> Let's call sk_msg_free() when tcp_bpf_send_verdict fails to allocate
> >>> psock->cork.
> >>>
> >>> [0]:
> >>> WARNING: net/ipv4/af_inet.c:156 at inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156, CPU#1: syz-executor/5983
> >>> Modules linked in:
> >>> CPU: 1 UID: 0 PID: 5983 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
> >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
> >>> RIP: 0010:inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156
> >>> Code: 0f 0b 90 e9 62 fe ff ff e8 7a db b5 f7 90 0f 0b 90 e9 95 fe ff ff e8 6c db b5 f7 90 0f 0b 90 e9 bb fe ff ff e8 5e db b5 f7 90 <0f> 0b 90 e9 e1 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 9f fc
> >>> RSP: 0018:ffffc90000a08b48 EFLAGS: 00010246
> >>> RAX: ffffffff8a09d0b2 RBX: dffffc0000000000 RCX: ffff888024a23c80
> >>> RDX: 0000000000000100 RSI: 0000000000000fff RDI: 0000000000000000
> >>> RBP: 0000000000000fff R08: ffff88807e07c627 R09: 1ffff1100fc0f8c4
> >>> R10: dffffc0000000000 R11: ffffed100fc0f8c5 R12: ffff88807e07c380
> >>> R13: dffffc0000000000 R14: ffff88807e07c60c R15: 1ffff1100fc0f872
> >>> FS: 00005555604c4500(0000) GS:ffff888125af1000(0000) knlGS:0000000000000000
> >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> CR2: 00005555604df5c8 CR3: 0000000032b06000 CR4: 00000000003526f0
> >>> Call Trace:
> >>>  <IRQ>
> >>>  __sk_destruct+0x86/0x660 net/core/sock.c:2339
> >>>  rcu_do_batch kernel/rcu/tree.c:2605 [inline]
> >>>  rcu_core+0xca8/0x1770 kernel/rcu/tree.c:2861
> >>>  handle_softirqs+0x286/0x870 kernel/softirq.c:579
> >>>  __do_softirq kernel/softirq.c:613 [inline]
> >>>  invoke_softirq kernel/softirq.c:453 [inline]
> >>>  __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680
> >>>  irq_exit_rcu+0x9/0x30 kernel/softirq.c:696
> >>>  instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
> >>>  sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1052
> >>>  </IRQ>
> >>>
> >>> Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
> >>> Reported-by: syzbot+4cabd1d2fa917a456db8@xxxxxxxxxxxxxxxxxxxxxxxxx
> >>> Closes: https://lore.kernel.org/netdev/68c0b6b5.050a0220.3c6139.0013.GAE@xxxxxxxxxx/
> >>> Signed-off-by: Kuniyuki Iwashima <kuniyu@xxxxxxxxxx>
> >>> ---
> >>>  net/ipv4/tcp_bpf.c | 4 +++-
> >>>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> >>> index ba581785adb4..ee6a371e65a4 100644
> >>> --- a/net/ipv4/tcp_bpf.c
> >>> +++ b/net/ipv4/tcp_bpf.c
> >>> @@ -408,8 +408,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
> >>>  		if (!psock->cork) {
> >>>  			psock->cork = kzalloc(sizeof(*psock->cork),
> >>>  					      GFP_ATOMIC | __GFP_NOWARN);
> >>> -			if (!psock->cork)
> >>> +			if (!psock->cork) {
> >>> +				sk_msg_free(sk, msg);
> >>
> >> Nothing has been corked yet, does it need to update the "*copied":
> >>
> >>	*copied -= sk_msg_free(sk, msg);
> >
> > Oh exactly, or simply *copied = 0 ?
>
> Make sense. I made the change and updated the commit message for this fix also.
> Applied. Thanks.

Thank you Martin!
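
For anyone reading this thread later, the accounting problem discussed above can be illustrated with a small userspace model. This is only a sketch, not kernel code: every toy_* name is hypothetical, the single "charged" counter is a simplified stand-in for the sk->sk_forward_alloc change made by sk_msg_alloc(), and the -ENOMEM return on allocation failure is an assumption of the sketch. The revert path follows the "*copied = 0" suggestion made in the review.

```c
/* Toy userspace model of the accounting discussed in this thread.
 * Not kernel code: all toy_* names are hypothetical stand-ins.
 */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_sock {
	int charged;	/* simplified stand-in for the sk_forward_alloc change */
};

struct toy_msg {
	int size;	/* bytes this message has charged to the socket */
};

/* Stand-in for sk_msg_alloc(): charge the socket for len queued bytes. */
void toy_msg_alloc(struct toy_sock *sk, struct toy_msg *msg, int len)
{
	assert(len >= 0);
	sk->charged += len;
	msg->size += len;
}

/* Stand-in for sk_msg_free(): undo the charge, return the bytes freed. */
int toy_msg_free(struct toy_sock *sk, struct toy_msg *msg)
{
	int freed = msg->size;

	sk->charged -= freed;
	msg->size = 0;
	return freed;
}

/* Stand-in for the cork allocation step in tcp_bpf_send_verdict().
 * cork_alloc_ok models whether the kzalloc() of psock->cork succeeded;
 * revert selects whether the failure path undoes the charge.
 */
int toy_send_verdict(struct toy_sock *sk, struct toy_msg *msg,
		     bool cork_alloc_ok, bool revert, int *copied)
{
	if (!cork_alloc_ok) {
		if (revert) {
			toy_msg_free(sk, msg);
			*copied = 0;	/* nothing was corked or sent */
		}
		return -ENOMEM;	/* assumed error code for this sketch */
	}
	return 0;
}

/* Stand-in for the destruct-time check: a leftover charge would warn. */
bool toy_destruct_would_warn(const struct toy_sock *sk)
{
	return sk->charged != 0;
}
```

In this model, charging 100 bytes with toy_msg_alloc() and then calling toy_send_verdict() with cork_alloc_ok = false and revert = false leaves the charge in place, so toy_destruct_would_warn() returns true. With revert = true the charge is undone, *copied is reset to 0, and the check returns false.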