Re: [PATCH bpf-next V2 0/7] xdp: Allow BPF to set RX hints for XDP_REDIRECTed packets

On 18/07/2025 03.25, Jakub Kicinski wrote:
> On Thu, 17 Jul 2025 15:08:49 +0200 Jesper Dangaard Brouer wrote:
>> Let me explain why writing into the RX descriptors is a bad idea.
>> The DMA descriptors are allocated as coherent DMA (dma_alloc_coherent).
>> This is memory that is shared with the NIC hardware device, which
>> implies cache-line coherence.  NIC performance is tightly coupled to
>> limiting cache misses for descriptors.  One common trick is to pack more
>> descriptors into a single cache-line.  Thus, if we start to write into
>> the current RX-descriptor, then we invalidate that cache-line as seen from
>> the device, and the next RX-descriptor (from this cache-line) will be in an
>> unfortunate coherency state.  Behind the scenes this might lead to some
>> extra PCIe transactions.
>>
>> By writing to the xdp_frame, we don't have to modify the DMA descriptors
>> directly and risk invalidating cache lines for the NIC.
>
> I thought your main use case is redirected packets. In which case it's
> the _remote_ end that's writing its metadata, if it's veth it's
> obviously not going to be doing it into DMA coherent memory.

My apologies for the confusion. That entire explanation about the
dangers of writing to RX descriptors was a direct response to
Stanislav's earlier proposal (for the XDP_PASS case, I assume).

You are right that this isn't relevant for redirected xdp_frames,
as there is no access to the original RX-descriptor on a remote CPU or
target device like veth.
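
To make the cache-line sharing concern from the RX-descriptor discussion
concrete, here is a minimal userspace sketch. It assumes a 64-byte cache
line and a made-up 16-byte descriptor layout; struct rx_desc and the
helper names are hypothetical and do not correspond to any real NIC or
kernel structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size, for illustration only. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical 16-byte RX descriptor (not any real NIC's format). */
struct rx_desc {
	uint64_t buf_addr;	/* DMA address of the packet buffer */
	uint32_t status;	/* written back by the device */
	uint16_t pkt_len;
	uint16_t vlan_tag;
};

static_assert(sizeof(struct rx_desc) == 16, "sketch assumes 16-byte descriptors");

/* Number of descriptors packed into one cache line. */
static inline size_t descs_per_cache_line(void)
{
	return CACHE_LINE_SIZE / sizeof(struct rx_desc);
}

/* Cache line holding descriptor idx, assuming a cache-line aligned ring. */
static inline size_t desc_cache_line(size_t idx)
{
	return (idx * sizeof(struct rx_desc)) / CACHE_LINE_SIZE;
}

/* Nonzero if a CPU write to descriptor a touches the line that holds b. */
static inline int descs_share_cache_line(size_t a, size_t b)
{
	return desc_cache_line(a) == desc_cache_line(b);
}
```

Under these assumptions four descriptors share each line, so a CPU write
to descriptor 0 touches the same cache line that the device uses for
descriptors 1 to 3.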


>> Thanks for the feedback. I can see why you'd be concerned about adding
>> another ad hoc scheme or making xdp_frame grow into a "para-skb".
>>
>> However, I'd like to frame this as part of a long-term plan we've been
>> calling the "mini-SKB" concept. This isn't a new idea, but a
>> continuation of architectural discussions from as far back as [2016].
>
> My understanding is that while this was floated as a plan by some,
> nobody came up with a clean way of implementing it.

I can see why you might think that, but from my perspective, the
xdp_frame *is* the implementation of the mini-SKB concept. We've been
building it incrementally for years. It started as the most minimal
structure possible and has gradually gained more context (e.g. dev_rx,
mem_info/rxq_info, flags, and also uses skb_shared_info with same layout
as SKB).

This patch is simply the next logical step in that existing evolution:
adding hardware metadata to make it more capable, starting with enabling
XDP_REDIRECT offloads. The xdp_frame is our mini-SKB, and this patchset
continues its evolution.
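
As a rough illustration of the idea of a frame that carries its own
hardware metadata, here is a simplified userspace model. This is not the
kernel's struct xdp_frame and not the layout used by the patchset: the
struct, the flag bit, the rx_hash field and the helper names are all
hypothetical. It only shows the general pattern of a setter marking a
hint as valid and a consumer on the remote end checking that mark
before using it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit meaning "hints.rx_hash is valid". */
#define MINI_FRAME_F_RX_HASH (1u << 0)

/* Hypothetical container for RX hints carried with the frame. */
struct mini_frame_hints {
	uint32_t rx_hash;
};

/* Simplified stand-in for a frame descriptor; not the kernel layout. */
struct mini_frame {
	void *data;
	uint32_t len;
	uint32_t flags;
	struct mini_frame_hints hints;
};

/* Producer side: store the hash and mark it valid. */
static inline void mini_frame_set_rx_hash(struct mini_frame *f, uint32_t hash)
{
	f->hints.rx_hash = hash;
	f->flags |= MINI_FRAME_F_RX_HASH;
}

/* Consumer side: return false if no hash was set, else copy it out. */
static inline bool mini_frame_get_rx_hash(const struct mini_frame *f,
					   uint32_t *hash)
{
	if (!(f->flags & MINI_FRAME_F_RX_HASH))
		return false;
	*hash = f->hints.rx_hash;
	return true;
}
```

In this model the metadata travels with the frame itself, so the
consumer never needs access to the original RX descriptor.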

--Jesper



