On Tue, Aug 19, 2025 at 09:10:42PM +0200, Florian Westphal wrote: > Florian Westphal <fw@xxxxxxxxx> wrote: > > > For the commit phase, I suggest to add a list of dying elements to the > > > transaction object. After unlinking the element from the (internal) > > > set data structure, add it to this transaction dying list so it > > > remains reachable to be released after the rcu grace period. > > > > Thats what I meant by 'stick a pointer into struct nft_set_ext'. > > Its awkward but I should be able to get the priv pointer back > > by doing the inverse of nft_set_elem_ext(). > > > > The cleaner solution would be to turn nft_elem_priv into a > > 'nft_elem_common', place a hlist_node into that and then > > use container_of(). But its too much code churn for my > > liking. > > > > So I'll extend each set element with a pointer and > > add a removed_elements hlist_head to struct nft_trans_elem. > > > > The transacion id isn't needed I think once that list exist: > > it provides the needed info to undo previous operations > > without the need to walk the set again. > > > > We can probably even rework struct nft_trans_elem to always use > > this pointer, even for inserts, and only use the > > > > struct nft_trans_one_elem elems[] > > > > member for elements that we update (no add or removal). > > But thats something for a later time. > > This doesn't work. > NEWSETELEM cannot (re)use the same list node as DELSETELEM. > > Reason is that a set flush will also flush elements > added in the same batch. > > But if NEWSETELEM uses a list (instead of priv pointer > as we do now), then at the time of the set flush, the > encountered element is already on a NEWSETELEM trans_elem list. > > I'll try doing: > > struct nft_set_ext { > u8 genmask; > u8 offset[NFT_SET_EXT_NUM]; > + struct llist_node trans_list_new; > + struct llist_node trans_list_del; > char data[]; > > to avoid this problem. Hm, I think this is not looking good. I am considering it is better to take your patch by now, then postpone explore further memory consumption reduction at a later stage. Thanks for addressing my suggestion, let me know if you prefer this path, I apologize for delaying your original proposal.