On Tue, 20 May 2025 19:41:19 +0200
David Hildenbrand <david@xxxxxxxxxx> wrote:

> On 20.05.25 18:59, Peter Xu wrote:
> > Hi, Alex,
> >
> > On Tue, May 20, 2025 at 08:07:19AM -0600, Alex Williamson wrote:
> >> Peter, David, if you wouldn't mind double checking the folio usage
> >> here, I'd appreciate it. The underlying assumption used here is that
> >> folios always have physically contiguous pages, so we can increment at
> >> the remainder of the folio_nr_pages() rather than iterate each page.
> >
> > Yes I think so. E.g., there's comment above folio definition too:
>
> It has consecutive PFNs, yes (i.e., pfn++). The "struct page" might not
> be consecutive (i.e., page++ does not work for larger folios).

The former, contiguous PFNs, is all we need here. We're feeding the
IOMMU mapping, so we're effectively just looking for the largest extent
of contiguous PFNs for mapping a given IOVA. The struct page is really
just for GUP, finding the next pfn, and with this, finding the offset
into the large folio.

> > /**
> >  * struct folio - Represents a contiguous set of bytes.
> >  * ...
> >  * A folio is a physically, virtually and logically contiguous set
> >  * of bytes...
> >  */
> >
> > For 1G, I wonder if in the future vfio can also use memfd_pin_folios()
> > internally when possible, e.g. after stumbled on top of a hugetlb folio
> > when filling the batch.
>
> Yeah, or have a better GUP interface that gives us folio ranges instead
> of individual pages.
>
> Using memfd directly is obviously better where possible.

Yeah, we brought up some of these issues during previous reviews.
Ultimately we want to move to IOMMUFD, which already has these
features, but it still lacks P2P DMA mapping and isn't as well
supported by various management stacks. This leaves some performance
on the table, but has a pretty high return for a relatively trivial
change. Thanks,

Alex
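
As a rough illustration of the folio-remainder idea discussed above (this
is only a sketch, not the actual vfio change; the helper name
contig_pages_in_folio() is hypothetical), the number of physically
contiguous pages left from a GUP-returned page to the end of its folio
can be derived from the standard folio helpers in <linux/mm.h>:

#include <linux/mm.h>

/*
 * Sketch only: given a page returned by GUP, report how many physically
 * contiguous pages remain from it through the end of its folio.  PFNs
 * within a folio are consecutive, so a caller building an IOMMU mapping
 * can extend the current IOVA extent by this many pages instead of
 * testing each page individually.
 */
static long contig_pages_in_folio(struct page *page)
{
        struct folio *folio = page_folio(page);

        return folio_nr_pages(folio) - folio_page_idx(folio, page);
}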