Pablo Neira Ayuso <pablo@xxxxxxxxxxxxx> wrote: > > Not signed off as I don't see this as more elegant as v1 here: > > https://lore.kernel.org/netfilter-devel/20250704123024.59099-1-fw@xxxxxxxxx/ > > Not very elegant, maybe it is just incomplete. Its complete (both the sleepable-iter linked above and this RFC). The RFC omits conversion of NETSETELEM however. I did not spend time on that because I'm not sure I'm following your suggestion in the first place. > > Both DEL/NEWSETELEM would be changed to first peek the transaction list > > tail to see if a compatible transaction exists and re-use that instead > > of allocating a new one. > > Right. Would all this provide even more memory savings? Yes. The memory saving would come from no-need for elems[], except for update case. Atm we can store 124 elements in one nft_trans structure. Each nft_trans_elem has 2k size, to hold the pointers to the elements added/removed. But with list based approach you need one nft_trans struct only (96 byte) in ideal conditions (same op on same set, e.g. flush case). So its in the 96 byte vs. 163840 bytes range for 10.000 elem del/add. The downside is the permanent 8-byte per element increase. But, the truckload of temporary trans objects will be gone. > > Pablo, please let me know if you prefer this direction compared to v1. > > If so, I would also work on removing the trailing dynamically sized > > array from nft_trans_elem structure in a followup patch. > > I don't remember when precisely, but time ago, you mentioned something > like "this transaction infrastructure creates myriad of temporary > objects". Your dynamic array infrastructure made it better. > > Maybe it is time to integrate transaction infrastrcture more tightly > into the existing infrastructure, so there is not need to allocate so > many ancilliary objects for large sets? > > There is a trade-off in all this. Yes, there is. If you agree I will try to extend the RFC patchset: - add one patch to convert NEWSETELEM case - add one patch to get rid of elems[]. Unless you think there is a use case for update from userspace that makes a mass update of a set, such as modifying the timeout of 1k elements, then it would be better to keep elems[] and use it only for update case. Let me know what you think -- I don't think its a scenario worth optimizing for.