Re: [PATCH bpf-next V2 0/7] xdp: Allow BPF to set RX hints for XDP_REDIRECTed packets

On 29/07/2025 21.47, Martin KaFai Lau wrote:
> On 7/29/25 4:15 AM, Jesper Dangaard Brouer wrote:
>> That idea has been considered before, but it unfortunately doesn't work
>> from a performance angle. The performance model of XDP_REDIRECT into
>> CPUMAP relies on moving the expensive SKB allocation+init to a remote
>> CPU. This keeps the ingress CPU free to process packets at near line
>> rate (our DDoS use-case). If we allocate the SKB on the ingress-CPU
>> before the redirect, we destroy this load-balancing model and create the
>> exact bottleneck we designed CPUMAP to avoid.
>
> iirc, a xdp prog can be attached to a cpumap. The skb can be created by
> that xdp prog running on the remote cpu. It should be like a xdp prog
> returning a XDP_PASS + an optional skb. The xdp prog can set some fields
> in the skb. Other than setting fields in the skb, something else may be
> also possible in the future, e.g. look up sk, earlier demux ...etc.

I have strong reservations about having the BPF program itself trigger
the SKB allocation. I believe this would fundamentally break the
performance model that makes cpumap redirect so effective.

The key to XDP's high performance lies in processing a bulk of
xdp_frames in a tight loop to amortize costs. The existing cpumap code
on the remote CPU is already highly optimized for this: it performs bulk
allocation of SKBs and uses careful prefetching to hide the memory
latency. Allowing a BPF program to sometimes trigger a heavyweight SKB
alloc+init (4 cache-line misses) would bypass all these existing
optimizations. It would introduce significant jitter into the pipeline
and disrupt the entire bulk-processing model we rely on for performance.
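
To make the amortization argument concrete, here is a minimal userspace
sketch of the pattern, not the actual cpumap code. All names (struct frame,
struct pkt, bulk_alloc, process_batch, BATCH_MAX) are hypothetical, and
calloc() merely stands in for a bulk SKB allocator. The point it shows is
one allocator call per batch plus a prefetch of the next frame while the
current one is being initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BATCH_MAX 8	/* hypothetical batch size for this sketch */

struct frame { const void *data; size_t len; };	/* stand-in for xdp_frame */
struct pkt   { const void *data; size_t len; };	/* stand-in for an SKB */

static unsigned alloc_calls;	/* counts allocator round trips */

/* One allocator call covers the whole batch (assumption: calloc models it). */
static struct pkt *bulk_alloc(size_t n)
{
	alloc_calls++;
	return calloc(n, sizeof(struct pkt));
}

/* Build n packets from n frames; returns packets built, caller frees *out. */
static size_t process_batch(const struct frame *frames, size_t n,
			    struct pkt **out)
{
	assert(n <= BATCH_MAX);
	*out = NULL;
	if (n == 0)
		return 0;

	struct pkt *pkts = bulk_alloc(n);
	if (!pkts)
		return 0;

	for (size_t i = 0; i < n; i++) {
		/* Start pulling in the next frame while we init this one. */
		if (i + 1 < n)
			__builtin_prefetch(frames[i + 1].data);
		pkts[i].data = frames[i].data;
		pkts[i].len = frames[i].len;
	}
	*out = pkts;
	return n;
}
```

A per-packet allocation triggered from inside the BPF program would turn
the single bulk_alloc() call above into one call per frame, at a point the
loop cannot schedule its prefetches around.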

This performance is not just theoretical; we rely on it for DDoS
protection. For example, our plan is to use the XDP program on the
cpumap hook to run the secondary DDoS mitigation rules that currently
live in iptables (amusingly, many of those rules are already BPF program
snippets today).

Architecturally, there is a clean separation today: the BPF program
makes a decision, and the highly-optimized cpumap or core kernel code
acts on it (build_skb, napi_gro_receive, etc). Your proposal blurs this
line significantly. Our patch, in contrast, preserves this model. It
simply provides the necessary data (the RX hash, VLAN tag and timestamp)
to the existing cpumap/veth SKB path via the xdp_frame.
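
As a rough illustration of that separation, here is a userspace sketch
under stated assumptions. It is not the layout or API from the patch
series: struct rx_hints, the HINT_* flags, hints_set_hash(),
hints_set_vlan() and apply_hints() are all hypothetical names. The
"program" side only records values in the frame; the "kernel" side copies
whatever is marked valid into the packet when it builds it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical validity flags, one per hint. */
enum {
	HINT_HASH   = 1u << 0,
	HINT_VLAN   = 1u << 1,
	HINT_TSTAMP = 1u << 2,
};

/* Hypothetical hint area that travels with the frame. */
struct rx_hints {
	uint32_t flags;
	uint32_t hash;
	uint16_t vlan_tci;
	uint64_t tstamp_ns;
};

struct frame { struct rx_hints hints; };	/* stand-in for xdp_frame */

struct pkt {					/* stand-in for an SKB */
	uint32_t hash;
	uint16_t vlan_tci;
	uint64_t tstamp_ns;
	bool has_hash, has_vlan, has_tstamp;
};

/* "BPF program" side: record a decision, allocate nothing. */
static void hints_set_hash(struct frame *f, uint32_t hash)
{
	f->hints.hash = hash;
	f->hints.flags |= HINT_HASH;
}

static void hints_set_vlan(struct frame *f, uint16_t tci)
{
	f->hints.vlan_tci = tci;
	f->hints.flags |= HINT_VLAN;
}

/* "Kernel" side: apply only the hints that were marked valid. */
static void apply_hints(const struct frame *f, struct pkt *p)
{
	assert(f && p);
	if (f->hints.flags & HINT_HASH) {
		p->hash = f->hints.hash;
		p->has_hash = true;
	}
	if (f->hints.flags & HINT_VLAN) {
		p->vlan_tci = f->hints.vlan_tci;
		p->has_vlan = true;
	}
	if (f->hints.flags & HINT_TSTAMP) {
		p->tstamp_ns = f->hints.tstamp_ns;
		p->has_tstamp = true;
	}
}
```

The packet-building code stays in one place and keeps its batching; the
program never touches struct pkt directly.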

While more advanced capabilities are an interesting topic for the
future, my goal here is to solve the immediate, concrete problem of
transferring metadata cleanly, without disrupting the performance
architecture we rely on for use cases like DDoS mitigation.

--Jesper




