On Mon, Aug 25, 2025 at 2:29 PM Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote:
>
> On 08/25, Amery Hung wrote:
> > Add kfunc, bpf_xdp_pull_data(), to support pulling data from xdp
> > fragments. Similar to bpf_skb_pull_data(), bpf_xdp_pull_data() makes
> > the first len bytes of data directly readable and writable in bpf
> > programs. If the "len" argument is larger than the linear data size,
> > data in fragments will be copied to the linear region when there
> > is enough room between xdp->data_end and xdp_data_hard_end(xdp),
> > which is subject to driver implementation.
> >
> > A use case of the kfunc is to decapsulate headers residing in xdp
> > fragments. It is possible for a NIC driver to place headers in xdp
> > fragments. To keep using direct packet access for parsing and
> > decapsulating headers, users can pull headers into the linear data
> > area by calling bpf_xdp_pull_data() and then pop the header with
> > bpf_xdp_adjust_head().
> >
> > An unused argument, flags is reserved for future extension (e.g.,
> > tossing the data instead of copying it to the linear data area).
> >
> > Signed-off-by: Amery Hung <ameryhung@xxxxxxxxx>
> > ---
> >  net/core/filter.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 52 insertions(+)
> >
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index f0ee5aec7977..82d953e077ac 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -12211,6 +12211,57 @@ __bpf_kfunc int bpf_sock_ops_enable_tx_tstamp(struct bpf_sock_ops_kern *skops,
> >  	return 0;
> >  }
> >
> > +__bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len, u64 flags)
> > +{
> > +	struct xdp_buff *xdp = (struct xdp_buff *)x;
> > +	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
> > +	void *data_end, *data_hard_end = xdp_data_hard_end(xdp);
> > +	int i, delta, buff_len, n_frags_free = 0, len_free = 0;
> > +
> > +	buff_len = xdp_get_buff_len(xdp);
> > +
> > +	if (unlikely(len > buff_len))
> > +		return -EINVAL;
> > +
> > +	if (!len)
> > +		len = xdp_get_buff_len(xdp);
>
> Why not return -EINVAL here for len=0?
>

I'm trying to mirror the behavior of bpf_skb_pull_data() here to reduce
confusion.

> > +
> > +	data_end = xdp->data + len;
> > +	delta = data_end - xdp->data_end;
> > +
> > +	if (delta <= 0)
> > +		return 0;
> > +
> > +	if (unlikely(data_end > data_hard_end))
> > +		return -EINVAL;
> > +
> > +	for (i = 0; i < sinfo->nr_frags && delta; i++) {
> > +		skb_frag_t *frag = &sinfo->frags[i];
> > +		u32 shrink = min_t(u32, delta, skb_frag_size(frag));
> > +
> > +		memcpy(xdp->data_end + len_free, skb_frag_address(frag), shrink);
>
> skb_frag_address can return NULL for unreadable frags.

Is it safe to assume that drivers will ensure frags are readable? It
seems at least mlx5 does. I did a quick check and found other xdp kfuncs
using skb_frag_address() without checking the return value.

Thanks for reviewing!
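
To make the len handling discussed above easier to follow, here is a
rough userspace model of the checks in the quoted hunk. It is only a
sketch: struct model_frag, struct model_buff, model_buff_len() and
model_pull_data() are hypothetical stand-ins, not kernel types or APIs.
The quoted hunk is cut off right after the memcpy(), so shrinking each
fragment in place is an assumption made for illustration and may not
match what the patch does with the copied fragments.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fragment; not the kernel's skb_frag_t. */
struct model_frag {
	unsigned char *addr;
	size_t size;
};

/* Hypothetical stand-in for a multi-buffer packet; not struct xdp_buff. */
struct model_buff {
	unsigned char *data;     /* start of linear data */
	unsigned char *data_end; /* end of linear data */
	unsigned char *hard_end; /* end of tailroom usable for pulled bytes */
	struct model_frag *frags;
	int nr_frags;
};

/* Total packet length: linear part plus all fragments. */
static size_t model_buff_len(const struct model_buff *b)
{
	size_t len = (size_t)(b->data_end - b->data);

	for (int i = 0; i < b->nr_frags; i++)
		len += b->frags[i].size;
	return len;
}

/* Make the first len bytes linear; len == 0 means "the whole packet". */
static int model_pull_data(struct model_buff *b, size_t len)
{
	size_t total = model_buff_len(b);
	size_t linear = (size_t)(b->data_end - b->data);
	size_t delta;

	if (len > total)
		return -EINVAL;
	if (!len)
		len = total; /* mirrors the len == 0 case in the quoted hunk */
	if (len <= linear)
		return 0; /* already linear, nothing to copy */
	if (len > (size_t)(b->hard_end - b->data))
		return -EINVAL; /* not enough tailroom */

	delta = len - linear;
	for (int i = 0; i < b->nr_frags && delta; i++) {
		struct model_frag *f = &b->frags[i];
		size_t shrink = delta < f->size ? delta : f->size;

		memcpy(b->data_end, f->addr, shrink);
		b->data_end += shrink;
		/* Assumption: consume the copied bytes from the fragment. */
		f->addr += shrink;
		f->size -= shrink;
		delta -= shrink;
	}
	/* len <= total guarantees the fragments covered the whole delta. */
	assert(delta == 0);
	return 0;
}
```

In this model, a len of 0 linearizes the whole packet when the tailroom
allows it and fails with -EINVAL otherwise, which is the behavior the
len=0 question above is about.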