On Tue, Jul 29, 2025 at 01:37:19PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote:
> > DELSETELEM does not unlink elements from set in the preparation phase,
> > instead elements are marked as inactive in the next generation but
> > they still remain linked to the set. These elements are removed from
> > the set from either the commit/abort phase.
> >
> > - flush should skip elements that are already inactive
> > - flush should not work on deleted sets.
> > - flush command (elements are marked as inactive) then delete set
> >   skips those elements that are inactive. So abort path can unwind
> >   accordingly using the transaction id marker what I am proposing.
>
> Yes, that part works, but we still need to kfree the elements after unlink.
>
> When commit phase does the unlink, the element becomes unreachable from
> the set. At this time, the DELSETELEM object keeps a pointer to the
> unlinked elements, and that allows us to kfree after synchronize_rcu
> from the worker. If we don't want DELSETELEM for flush, we need to
> provide the address to free by other means, e.g. stick a pointer into
> struct nft_set_ext.

For the commit phase, I suggest adding a list of dying elements to the
transaction object. After unlinking the element from the (internal) set
data structure, add it to this transaction dying list so that it remains
reachable and can be released after the RCU grace period.
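
To illustrate the dying list idea, here is a rough userspace sketch. It is
not nf_tables code: struct elem, struct set, struct trans and the three
helpers are hypothetical names made up for this example, and the RCU grace
period is only indicated by a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model, not the kernel's data structures. */
struct elem {
	int key;
	struct elem *next;       /* linkage inside the set */
	struct elem *dying_next; /* separate linkage on the dying list */
};

struct set {
	struct elem *head;
};

struct trans {
	struct elem *dying; /* unlinked at commit, not yet freed */
};

/* Link a new element at the head of the set; returns 0 on success. */
static int set_add(struct set *s, int key)
{
	struct elem *e;

	assert(s != NULL);
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->key = key;
	e->dying_next = NULL;
	e->next = s->head;
	s->head = e;
	return 0;
}

/*
 * Commit phase of a flush: unlink every element from the set and park it
 * on the transaction dying list. e->next is left untouched on purpose,
 * since in an RCU setting a concurrent reader may still be walking
 * through the unlinked element. Returns the number of elements moved.
 */
static size_t commit_flush(struct set *s, struct trans *t)
{
	size_t n = 0;

	while (s->head) {
		struct elem *e = s->head;

		s->head = e->next;
		e->dying_next = t->dying;
		t->dying = e;
		n++;
	}
	return n;
}

/*
 * Stand-in for the worker: in the kernel this step would only run after
 * the grace period has elapsed. Returns the number of elements freed.
 */
static size_t release_dying(struct trans *t)
{
	size_t n = 0;

	while (t->dying) {
		struct elem *e = t->dying;

		t->dying = e->dying_next;
		free(e);
		n++;
	}
	return n;
}
```

The point of the sketch is that the elements stay reachable through the
transaction after the unlink, so no per-element DELSETELEM object is needed
just to remember the address to free.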