On 07/02, Maciej Fijalkowski wrote:
> On Wed, Jul 02, 2025 at 12:16:48PM +0200, Maciej Fijalkowski wrote:
> > Eryk reported an issue that I have put under Closes: tag, related to
> > umem addrs being prematurely produced onto pool's completion queue.
> > Let us make the skb's destructor responsible for producing all addrs
> > that given skb used.
> >
> > Commit from fixes tag introduced the buggy behavior, it was not broken
> > from day 1, but rather when xsk multi-buffer got introduced.
> >
> > Store addrs at the beginning of skb's linear part and have a sanity
> > check if in any case driver would encapsulate headers in a way that data
> > would overwrite the [head, head + sizeof(xdp_desc::addr) *
> > (MAX_SKB_FRAGS + 1)] region, which we dedicate for umem addresses that
> > will be produced onto xsk_buff_pool's completion queue.
> >
> > This approach appears to survive scenario where underlying driver
> > linearizes the skb because pskb_pull_tail() under the hood will copy
> > header part to newly allocated memory. If this array would live in
> > tailroom it would get overridden when pulling frags onto linear part.
> > This happens when driver receives skb with frag count higher than what
> > HW is able to swallow (I came across this case on ice driver that has
> > maximum s/g count equal to 8).
> >
> > Initially we also considered storing 8-byte addr at the end of page
> > allocated by frag but xskxceiver has a test which writes full 4k to frag
> > and this resulted in corrupted addr.
> >
> > xsk_cq_submit_addr_locked() has to use xsk_get_num_desc() to find out
> > frag count as skb that we deal with within destructor might not have the
> > frags at all - as mentioned earlier drivers in their ndo_start_xmit()
> > might linearize the skb. We will not use cached_prod to update
> > producer's global state as its value might already have been increased,
> > which would result in too many addresses being submitted onto cq.
> >
> > Fixes: b7f72a30e9ac ("xsk: introduce wrappers and helpers for supporting multi-buffer in Tx path")
> > Reported-by: Eryk Kubanski <e.kubanski@xxxxxxxxxxxxxxxxxxx>
> > Closes: https://lore.kernel.org/netdev/20250530103456.53564-1-e.kubanski@xxxxxxxxxxxxxxxxxxx/
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx>
> > ---
> > net/xdp/xsk.c | 92 +++++++++++++++++++++++++++++++--------------
> > net/xdp/xsk_queue.h | 12 ++++++
> > 2 files changed, 75 insertions(+), 29 deletions(-)
> >
>
> There's a CI failure regarding xsk metadata selftest which I didn't run on
> my side, I focused on xdpsock+xskceiver, so I'll be taking a look into
> that plus I think we can avoid skb headroom hack by allocating struct with
> num_desc + addrs array and carry it via destructor_arg.

+1 on making it more explicit. Maybe we can pre-allocate an extra array
(with an element per tx descriptor slot) to hold the extra info we need?
And then pass a pointer to it via destructor_arg.
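
To make the shape concrete, here is a rough userspace sketch of the
"num_desc + addrs array carried via destructor_arg" idea. It is not
kernel code: struct toy_skb, struct toy_cq, struct toy_addrs and every
toy_*() helper are hypothetical stand-ins, and the max_descs field is
only there so the toy can bounds-check itself. The point it tries to
show is that the destructor takes the descriptor count from the
side-car record, so it still produces every addr even after the driver
has linearized the skb and the frag count reads zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_CQ_SIZE 16u /* hypothetical ring size, power of two */

/* Toy stand-in for the pool's completion queue. */
struct toy_cq {
	uint64_t ring[TOY_CQ_SIZE];
	uint32_t prod;
};

/* Hypothetical side-car record: descriptor count plus the umem addrs. */
struct toy_addrs {
	uint32_t num_descs;
	uint32_t max_descs; /* toy-only bound for the assert below */
	uint64_t addrs[];
};

/* Toy stand-in for an skb; only the fields the sketch needs. */
struct toy_skb {
	void *destructor_arg;
	struct toy_cq *cq;
	unsigned int nr_frags;
};

static struct toy_addrs *toy_addrs_alloc(uint32_t max_descs)
{
	struct toy_addrs *a;

	a = malloc(sizeof(*a) + (size_t)max_descs * sizeof(a->addrs[0]));
	if (a) {
		a->num_descs = 0;
		a->max_descs = max_descs;
	}
	return a;
}

/* Remember one descriptor's addr; every one after the first is a frag. */
static void toy_record_addr(struct toy_skb *skb, uint64_t addr)
{
	struct toy_addrs *a = skb->destructor_arg;

	assert(a->num_descs < a->max_descs);
	if (a->num_descs > 0)
		skb->nr_frags++;
	a->addrs[a->num_descs++] = addr;
}

/* What a driver's xmit path might do: the frags are gone afterwards. */
static void toy_linearize(struct toy_skb *skb)
{
	skb->nr_frags = 0;
}

/* Produce all recorded addrs onto the cq, ignoring skb->nr_frags. */
static void toy_destructor(struct toy_skb *skb)
{
	struct toy_addrs *a = skb->destructor_arg;
	struct toy_cq *cq = skb->cq;
	uint32_t i;

	for (i = 0; i < a->num_descs; i++)
		cq->ring[cq->prod++ & (TOY_CQ_SIZE - 1)] = a->addrs[i];

	free(a);
	skb->destructor_arg = NULL;
}

/* Build a 3-descriptor skb, linearize it, destroy it; return addrs produced. */
uint32_t toy_demo(struct toy_cq *cq)
{
	struct toy_skb skb = { .cq = cq };
	uint32_t before = cq->prod;

	skb.destructor_arg = toy_addrs_alloc(3);
	if (!skb.destructor_arg)
		return 0;

	toy_record_addr(&skb, 0x1000);
	toy_record_addr(&skb, 0x2000);
	toy_record_addr(&skb, 0x3000);
	toy_linearize(&skb);
	toy_destructor(&skb);

	return cq->prod - before;
}
```

Calling toy_demo() on a zeroed struct toy_cq should leave three entries
on the ring in submission order. A pre-allocated array with one element
per tx descriptor slot would replace the malloc() here with an index
into that array; the destructor side would look about the same.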