On Wed, May 28, 2025 at 8:11 PM Byungchul Park <byungchul@xxxxxx> wrote: > > To simplify struct page, the page pool members of struct page should be > moved to other, allowing these members to be removed from struct page. > > Introduce a network memory descriptor to store the members, struct > netmem_desc and make it union'ed with the existing fields in struct > net_iov, allowing to organize the fields of struct net_iov. The final > look of struct net_iov should be like: > > struct net_iov { > struct netmem_desc; > net_field1; /* e.g. struct net_iov_area *owner; */ > net_field2; > ... > }; > > Signed-off-by: Byungchul Park <byungchul@xxxxxx> This version looks much better from networking POV, and I think with small adjustments we can make it work I think. I'll leave Matthew to confirm if it's aligned with the memdesc plans. > --- > include/net/netmem.h | 101 +++++++++++++++++++++++++++++++++---------- > 1 file changed, 79 insertions(+), 22 deletions(-) > > diff --git a/include/net/netmem.h b/include/net/netmem.h > index 386164fb9c18..d52f86082271 100644 > --- a/include/net/netmem.h > +++ b/include/net/netmem.h > @@ -12,6 +12,47 @@ > #include <linux/mm.h> > #include <net/net_debug.h> > > +/* These fields in struct page are used by the page_pool and net stack: > + * > + * struct { > + * unsigned long pp_magic; > + * struct page_pool *pp; > + * unsigned long _pp_mapping_pad; > + * unsigned long dma_addr; > + * atomic_long_t pp_ref_count; > + * }; > + * > + * We mirror the page_pool fields here so the page_pool can access these > + * fields without worrying whether the underlying fields belong to a > + * page or netmem_desc. > + */ > +struct netmem_desc { > + unsigned long _flags; > + unsigned long pp_magic; > + struct page_pool *pp; > + unsigned long _pp_mapping_pad; > + unsigned long dma_addr; > + atomic_long_t pp_ref_count; > +}; > + > +#define NETMEM_DESC_ASSERT_OFFSET(pg, desc) \ > + static_assert(offsetof(struct page, pg) == \ > + offsetof(struct netmem_desc, desc)) > +NETMEM_DESC_ASSERT_OFFSET(flags, _flags); > +NETMEM_DESC_ASSERT_OFFSET(pp_magic, pp_magic); > +NETMEM_DESC_ASSERT_OFFSET(pp, pp); > +NETMEM_DESC_ASSERT_OFFSET(_pp_mapping_pad, _pp_mapping_pad); > +NETMEM_DESC_ASSERT_OFFSET(dma_addr, dma_addr); > +NETMEM_DESC_ASSERT_OFFSET(pp_ref_count, pp_ref_count); > +#undef NETMEM_DESC_ASSERT_OFFSET > + > +/* > + * Since struct netmem_desc uses the space in struct page, the size > + * should be checked, until struct netmem_desc has its own instance from > + * slab, to avoid conflicting with other members within struct page. > + */ > +static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount)); > + > /* net_iov */ > > DECLARE_STATIC_KEY_FALSE(page_pool_mem_providers); > @@ -31,12 +72,33 @@ enum net_iov_type { > }; > > struct net_iov { > - enum net_iov_type type; > - unsigned long pp_magic; > - struct page_pool *pp; > - struct net_iov_area *owner; > - unsigned long dma_addr; > - atomic_long_t pp_ref_count; > + union { > + struct netmem_desc desc; > + > + /* XXX: The following part should be removed once all > + * the references to them are converted so as to be > + * accessed via netmem_desc e.g. niov->desc.pp instead > + * of niov->pp. > + * > + * Plus, once struct netmem_desc has it own instance > + * from slab, network's fields of the following can be > + * moved out of struct netmem_desc like: > + * > + * struct net_iov { > + * struct netmem_desc desc; > + * struct net_iov_area *owner; > + * ... > + * }; > + */ We do not need to wait until netmem_desc has its own instance from slab to move the net_iov-specific fields out of netmem_desc. We can do that now, because there are no size restrictions on net_iov. So, I recommend change this to: struct net_iov { /* Union for anonymous aliasing: */ union { struct netmem_desc desc; struct { unsigned long _flags; unsigned long pp_magic; struct page_pool *pp; unsigned long _pp_mapping_pad; unsigned long dma_addr; atomic_long_t pp_ref_count; }; struct net_iov_area *owner; enum net_iov_type type; }; > > struct net_iov_area { > @@ -48,27 +110,22 @@ struct net_iov_area { > unsigned long base_virtual; > }; > > -/* These fields in struct page are used by the page_pool and net stack: > +/* net_iov is union'ed with struct netmem_desc mirroring struct page, so > + * the page_pool can access these fields without worrying whether the > + * underlying fields are accessed via netmem_desc or directly via > + * net_iov, until all the references to them are converted so as to be > + * accessed via netmem_desc e.g. niov->desc.pp instead of niov->pp. > * > - * struct { > - * unsigned long pp_magic; > - * struct page_pool *pp; > - * unsigned long _pp_mapping_pad; > - * unsigned long dma_addr; > - * atomic_long_t pp_ref_count; > - * }; > - * > - * We mirror the page_pool fields here so the page_pool can access these fields > - * without worrying whether the underlying fields belong to a page or net_iov. > - * > - * The non-net stack fields of struct page are private to the mm stack and must > - * never be mirrored to net_iov. > + * The non-net stack fields of struct page are private to the mm stack > + * and must never be mirrored to net_iov. > */ > -#define NET_IOV_ASSERT_OFFSET(pg, iov) \ > - static_assert(offsetof(struct page, pg) == \ > +#define NET_IOV_ASSERT_OFFSET(desc, iov) \ > + static_assert(offsetof(struct netmem_desc, desc) == \ > offsetof(struct net_iov, iov)) > +NET_IOV_ASSERT_OFFSET(_flags, type); Remove this assertion. > NET_IOV_ASSERT_OFFSET(pp_magic, pp_magic); > NET_IOV_ASSERT_OFFSET(pp, pp); > +NET_IOV_ASSERT_OFFSET(_pp_mapping_pad, owner); And this one. (_flags, type) and (_pp_mapping_pad, owner) have very different semantics and usage, we should not assert they are not the same offset. However (pp, pp) and (pp_magic,pp_magic) have the same semantics and usage, so we do assert they are at the same offset. Code is allowed to access __netmem_clear_lsb(netmem)->pp or __netmem_clear_lsb(netmem)->pp_magic without caring what's the underlying memory type because both fields have the same semantics and usage. Code should *not* assume it can access __netmem_clear_lsb(netmem)->owner or __netmem_clear_lsb(netmem)->type without doing a check whether the underlying memory is page/netmem_desc or net_iov. These fields are only usable for net_iov, so let's explicitly move them to a different place. -- Thanks, Mina