Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

> On Mon, Jun 23, 2025 at 11:50:58AM +0100, David Howells wrote:
> > What's the best way to manage this without having to go back to the page
> > struct for every DMA mapping we want to make?
>
> There isn't a very easy way.  Also because if you actually need to do
> peer to peer transfers, you right now absolutely need the page to find
> the pgmap that has the information on how to perform the peer to peer
> transfer.

Are you expecting P2P to become particularly common?  Because page struct
lookups will become more expensive because we'll have to do type checking and
Willy may eventually move them from a fixed array into a maple tree - so if we
can record the P2P flag in the bio_vec, it would help speed up the "not P2P"
case.

> > Do we need to have
> > iov_extract_user_pages() note this in the bio_vec?
> >
> >     struct bio_vec {
> >         physaddr_t      bv_base_addr;   /* 64-bits */
> >         size_t          bv_len:56;      /* Maybe just u32 */
> >         bool            p2pdma:1;       /* Region is involved in P2P */
> >         unsigned int    spare:7;
> >     };
>
> Having a flag in the bio_vec might be a way to shortcut the P2P or not
> decision a bit.  The downside is that without the flag, the bio_vec
> in the brave new page-less world would actually just be:
>
>     struct bio_vec {
>         phys_addr_t     bv_phys;
>         u32             bv_len;
>     } __packed;
>
> i.e. adding any more information would actually increase the size from
> 12 bytes to 16 bytes for the usualy 64-bit phys_addr_t setups, and thus
> undo all the memory savings that this move would provide.

Do we actually need 32 bits for bv_len, especially given that MAX_RW_COUNT is
capped at a bit less than 2GiB?  Could we, say, do:

    struct bio_vec {
        phys_addr_t     bv_phys;
        u32             bv_len:31;
        u32             bv_use_p2p:1;
    } __packed;

And rather than storing the how-to-do-P2P info in the page struct, does it
make sense to hold it separately, keyed on bv_phys?

Also, is it possible for the networking stack, say, to trivially map the P2P
memory in order to checksum it?  I presume bv_phys in that case would point to
a mapping of device memory?

Thanks,
David
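
For reference, the size argument for the 31+1 bit split can be checked in
userspace.  This is only a sketch under stated assumptions: a 64-bit
phys_addr_t, 4KiB pages, and GCC/Clang bit-field and packed-attribute
behaviour.  The names bio_vec_sketch, bvec_sketch_make() and the SKETCH_*
macros are made up for illustration and are not kernel API.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumptions for this userspace sketch: 64-bit physical addresses and
 * 4KiB pages.  Neither is guaranteed on every kernel configuration. */
typedef uint64_t phys_addr_t;
typedef uint32_t u32;

#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_PAGE_MASK	(~(SKETCH_PAGE_SIZE - 1))
/* Modelled on the kernel's MAX_RW_COUNT, i.e. (INT_MAX & PAGE_MASK). */
#define SKETCH_MAX_RW_COUNT	((u32)INT_MAX & SKETCH_PAGE_MASK)

/* Hypothetical layout: steal the top bit of the length for a P2P flag. */
struct bio_vec_sketch {
	phys_addr_t	bv_phys;
	u32		bv_len:31;
	u32		bv_use_p2p:1;
} __attribute__((packed));

/* The flag must not grow the struct beyond the 12 bytes quoted above. */
_Static_assert(sizeof(struct bio_vec_sketch) == 12,
	       "P2P flag must fit without growing the struct");
/* The largest single read/write length must still fit in 31 bits. */
_Static_assert(SKETCH_MAX_RW_COUNT < (1u << 31),
	       "MAX_RW_COUNT must be representable in bv_len:31");

/* Hypothetical helper that fills in one segment. */
static inline struct bio_vec_sketch
bvec_sketch_make(phys_addr_t phys, u32 len, int p2p)
{
	struct bio_vec_sketch bv = {
		.bv_phys	= phys,
		.bv_len		= len,
		.bv_use_p2p	= p2p ? 1 : 0,
	};
	return bv;
}
```

Both static assertions hold with these assumptions, so the flag costs no
extra space as long as no single bio_vec needs to describe 2GiB or more.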