From: Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx>
Date: Thu, 24 Apr 2025 14:48:39 +0200

> On Tue, Apr 15, 2025 at 07:28:12PM +0200, Alexander Lobakin wrote:
>> Start adding XDP-specific code to libeth, namely handling XDP_TX buffers
>> (only sending).

[...]

>> +static __always_inline u32
>> +libeth_xdp_tx_xmit_bulk(const struct libeth_xdp_tx_frame *bulk, void *xdpsq,
>> +			u32 n, bool unroll, u64 priv,
>> +			u32 (*prep)(void *xdpsq, struct libeth_xdpsq *sq),
>> +			struct libeth_xdp_tx_desc
>> +			(*fill)(struct libeth_xdp_tx_frame frm, u32 i,
>> +				const struct libeth_xdpsq *sq, u64 priv),
>> +			void (*xmit)(struct libeth_xdp_tx_desc desc, u32 i,
>> +				     const struct libeth_xdpsq *sq, u64 priv))
>> +{
>> +	u32 this, batched, off = 0;
>> +	struct libeth_xdpsq sq;
>> +	u32 ntu, i = 0;
>> +
>> +	n = min(n, prep(xdpsq, &sq));
>> +	if (unlikely(!n))
>> +		return 0;
>> +
>> +	ntu = *sq.ntu;
>> +
>> +	this = sq.count - ntu;
>
> maybe something more self-descriptive than 'this'? :)
> this is available space in sq, right?

'this' means "this batch", IOW what we'll send for sure in this iteration.

>
>> +	if (likely(this > n))
>> +		this = n;
>> +
>> +again:
>> +	if (!unroll)
>
> who takes this decision? a caller or is this dependent on some constraints
> of underlying system? when would you like to not unroll?

XDP_TX, ndo_xdp_xmit, and the XSk XDP_TX wrappers pass `false` here; only
XSk xmit passes `true`. In the cases other than XSk xmit, I saw no positive
impact, while the object code bloat was huge: XSk xmit fills only a Tx
descriptor, while all the rest also fill a &libeth_sqe.
>
>> +		goto linear;
>> +
>> +	batched = ALIGN_DOWN(this, LIBETH_XDP_TX_BATCH);
>> +
>> +	for ( ; i < off + batched; i += LIBETH_XDP_TX_BATCH) {
>> +		u32 base = ntu + i - off;
>> +
>> +		unrolled_count(LIBETH_XDP_TX_BATCH)
>> +		for (u32 j = 0; j < LIBETH_XDP_TX_BATCH; j++)
>> +			xmit(fill(bulk[i + j], base + j, &sq, priv),
>> +			     base + j, &sq, priv);
>> +	}
>> +
>> +	if (batched < this) {
>> +linear:
>> +		for ( ; i < off + this; i++)
>> +			xmit(fill(bulk[i], ntu + i - off, &sq, priv),
>> +			     ntu + i - off, &sq, priv);
>> +	}
>> +
>> +	ntu += this;
>> +	if (likely(ntu < sq.count))
>> +		goto out;
>> +
>> +	ntu = 0;
>> +
>> +	if (i < n) {
>> +		this = n - i;
>> +		off = i;
>> +
>> +		goto again;
>> +	}
>> +
>> +out:
>> +	*sq.ntu = ntu;
>> +	*sq.pending += n;
>> +	if (sq.xdp_tx)
>> +		*sq.xdp_tx += n;
>> +
>> +	return n;
>> +}

[...]

>> +/**
>> + * __libeth_xdp_tx_flush_bulk - internal helper to flush one XDP Tx bulk
>> + * @bq: bulk to flush
>> + * @flags: XDP TX flags
>> + * @prep: driver-specific callback to prepare the queue for sending
>> + * @fill: libeth_xdp callback to fill &libeth_sqe and &libeth_xdp_tx_desc
>
> Could you explain why this has to be implemented as a callback? for now
> this might just be directly called within libeth_xdp_tx_xmit_bulk() ?
>
> What I currently understand is this is not something that driver would
> provide. If it's a libeth-internal routine, then call it directly and
> simplify the code.

XSk XDP_TX passes a different callback here :>
Anyway, all the callbacks within libeth_xdp get inlined or (sometimes)
converted to direct calls, so there are no indirections.

> Besides, thanks a lot for this series and split up! I think we're on a
> good path.

Thanks,
Olek