On Fri, May 30, 2025 at 10:10:50AM -0300, Jason Gunthorpe wrote:
> On Thu, May 29, 2025 at 02:44:14PM -0700, Alex Mastro wrote:
>
> > We are wondering the following:
> > - Is all of the above expected behavior, and usage of VFIO?
> > - Is there an expected minimum alignment greater than 4K (our system page size)
> >   for non-MAP_FIXED mmap on a VFIO device fd?
> > - Was there an unintended regression to our use-case in between 6.9 and 6.13?

Probably due to aac6db75a9fc vfio/pci: Use unmap_mapping_range(). IIUC
the plan was huge fault could bring back the lost perf, but indeed the
alignment is still a challenge to at least always make right.

>
> I think this is something we have missed. VFIO should automatically
> align the VMA's address if not MAP_FIXED, otherwise it can't use the
> efficient huge page sizes anymore. qemu uses MAP_FIXED so we've left
> out the non-qemu users from this performance optimization.
>
> To fix it, the flow from the mm side is something like what
> shmem_get_unmapped_area() does. VFIO would probably want to align all
> BAR's to their size.

Good point! I overlooked the VA hints when QEMU doesn't need it. I can
have a closer look if nobody else will.

>
> Which seems to me probably wants some refactoring and a core helper
> 'mm_get_aligned_unmapped_area()'..
>
> I think if you are mmaping a huge huge BAR it is not surprising that
> it will take a huge amount of time to write out all of the 4K
> PTEs. The stalls on old kernels should probably be addressed by having
> cond_resched() inside the remap_pfnmap().

Right, but then that'll be a stable-only fix. If VFIO can provide a
valid get_unmapped_area(), then with huge faults maybe we don't even
need it, and such change can copy stable too.

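For non-QEMU users hitting this today, one possible userspace-side
workaround follows from the MAP_FIXED observation above: reserve a
slightly larger PROT_NONE range, pick an aligned address inside it, and
map the real fd there with MAP_FIXED. A minimal sketch in C, under a few
assumptions: mmap_aligned() is a hypothetical helper name (not a kernel
or libc API), and the right alignment (BAR size, 2M, 1G, ...) is left to
the caller, since what the huge fault path can actually use depends on
the kernel and the device.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of `fd` at `offset` so that the
 * returned address is a multiple of `align`. `align` must be a power of
 * two and at least the system page size. Returns MAP_FAILED on error.
 */
static void *mmap_aligned(size_t len, size_t align, int prot, int flags,
                          int fd, off_t offset)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    if (len == 0 || align < page || (align & (align - 1)) != 0 ||
        len > SIZE_MAX - align - page) {
        errno = EINVAL;
        return MAP_FAILED;
    }

    /* Work in whole pages so the trimming below stays page aligned. */
    size_t map_len = (len + page - 1) & ~(page - 1);
    size_t reserve_len = map_len + align;

    /* Reserve address space only; nothing is backed or accessible yet. */
    void *reserve = mmap(NULL, reserve_len, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (reserve == MAP_FAILED)
        return MAP_FAILED;

    /* Round up to the first aligned address inside the reservation. */
    uintptr_t base = (uintptr_t)reserve;
    uintptr_t aligned = (base + align - 1) & ~((uintptr_t)align - 1);
    assert(aligned + map_len <= base + reserve_len);

    /* Replace the aligned part of the reservation with the real mapping. */
    void *map = mmap((void *)aligned, map_len, prot, flags | MAP_FIXED,
                     fd, offset);
    if (map == MAP_FAILED) {
        int saved = errno;
        munmap(reserve, reserve_len);
        errno = saved;
        return MAP_FAILED;
    }

    /* Give back the unused head and tail of the reservation. */
    if (aligned > base)
        munmap(reserve, aligned - base);
    uintptr_t end = aligned + map_len;
    if (base + reserve_len > end)
        munmap((void *)end, base + reserve_len - end);

    return map;
}
```

For a VFIO BAR this would be called with the device fd, the region
offset, and MAP_SHARED, e.g. mmap_aligned(bar_size, bar_size,
PROT_READ | PROT_WRITE, MAP_SHARED, device_fd, region_offset), where
those variable names are placeholders. It only controls the virtual
address; whether huge mappings are then installed is still up to the
kernel's fault path.
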
Meanwhile, just to mention there's one more commit that vfio huge_fault
stable branches would like to have soon, that Alex fixed yet another
alignment related issue to do reliable huge faults:

commit c1d9dac0db168198b6f63f460665256dedad9b6e
Author: Alex Williamson <alex.williamson@xxxxxxxxxx>
Date:   Fri May 2 16:40:31 2025 -0600

    vfio/pci: Align huge faults to order

I think if your trace shows correct huge faults when you did correct
alignment, it should mean it doesn't affect your case (likely your app
sequentially fault in the bar region.. meanwhile likely there's no
concurrent, especially unaligned, faults when pre-fault everything). But
just something FYI and IIUC that commit will land 6.13.z soon.

Thanks,

--
Peter Xu