On 9/2/25 8:29 AM, Yafang Shao wrote: > During recent testing with the netem qdisc to inject delays into TCP > traffic, we observed that our CLS BPF program failed to function correctly > due to incorrect classid retrieval from task_get_classid(). The issue > manifests in the following call stack: > > bpf_get_cgroup_classid+5 > cls_bpf_classify+507 > __tcf_classify+90 > tcf_classify+217 > __dev_queue_xmit+798 > bond_dev_queue_xmit+43 > __bond_start_xmit+211 > bond_start_xmit+70 > dev_hard_start_xmit+142 > sch_direct_xmit+161 > __qdisc_run+102 <<<<< Issue location > __dev_xmit_skb+1015 > __dev_queue_xmit+637 > neigh_hh_output+159 > ip_finish_output2+461 > __ip_finish_output+183 > ip_finish_output+41 > ip_output+120 > ip_local_out+94 > __ip_queue_xmit+394 > ip_queue_xmit+21 > __tcp_transmit_skb+2169 > tcp_write_xmit+959 > __tcp_push_pending_frames+55 > tcp_push+264 > tcp_sendmsg_locked+661 > tcp_sendmsg+45 > inet_sendmsg+67 > sock_sendmsg+98 > sock_write_iter+147 > vfs_write+786 > ksys_write+181 > __x64_sys_write+25 > do_syscall_64+56 > entry_SYSCALL_64_after_hwframe+100 > > The problem occurs when multiple tasks share a single qdisc. In such cases, > __qdisc_run() may transmit skbs created by different tasks. Consequently, > task_get_classid() retrieves an incorrect classid since it references the > current task's context rather than the skb's originating task. > > Given that dev_queue_xmit() always executes with bh disabled, we can use > softirq_count() instead to obtain the correct classid. > > The simple steps to reproduce this issue: > 1. Add network delay to the network interface: > such as: tc qdisc add dev bond0 root netem delay 1.5ms > 2. Build two distinct net_cls cgroups, each with a network-intensive task > 3. Initiate parallel TCP streams from both tasks to external servers. > > Under this specific condition, the issue reliably occurs. The kernel > eventually dequeues an SKB that originated from Task-A while executing in > the context of Task-B. > > It is worth noting that it will change the established behavior for a > slightly different scenario: > > <sock S is created by task A> > <class ID for task A is changed> > <skb is created by sock S xmit and classified> > > prior to this patch the skb will be classified with the 'new' task A > classid, now with the old/original one. The bpf_get_cgroup_classid_curr() > function is a more appropriate choice for this case. > > Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx> > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx> > Cc: Thomas Graf <tgraf@xxxxxxx> > Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> > Cc: Nikolay Aleksandrov <razor@xxxxxxxxxxxxx> @Daniel: I'm wondering if you have some specific use-case explicit leveraging the affected helper. If so, could you please test this change does not break it? Thanks, Paolo