On Fri, 30 May 2025 10:25:01 -0400 Peter Xu <peterx@xxxxxxxxxx> wrote:
> On Fri, May 30, 2025 at 10:10:50AM -0300, Jason Gunthorpe wrote:
> Probably due to aac6db75a9fc vfio/pci: Use unmap_mapping_range().

Ack.

> > I think this is something we have missed. VFIO should automatically
> > align the VMA's address if not MAP_FIXED, otherwise it can't use the
> > efficient huge page sizes anymore. qemu uses MAP_FIXED so we've left
> > out the non-qemu users from this performance optimization.

Thanks for confirming.

> Good point! I overlooked the VA hints when QEMU doesn't need it. I can
> have a closer look if nobody else will.

This would be appreciated -- thank you!

> > I think if you are mmaping a huge huge BAR it is not surprising that
> > it will take a huge amount of time to write out all of the 4K
> > PTEs.

Agreed. This matches what we observed.

> I think if your trace shows correct huge faults when you did correct
> alignment, it should mean it doesn't affect your case (likely your app
> sequentially fault in the bar region.

Yes, this is the faulting triggered by the call stack below, downstream
from VFIO_IOMMU_MAP_DMA, which faults in the entire VA range to be mapped.

vfio_pci_mmap_huge_fault+0xf5/0x1b0 [vfio_pci_core]
__do_fault+0x3f/0x130
do_pte_missing+0x363/0xf40
handle_mm_fault+0x6d2/0x1200
fixup_user_fault+0x121/0x280
vaddr_get_pfns+0x185/0x3c0 [vfio_iommu_type1]
vfio_pin_pages_remote+0x1a1/0x590 [vfio_iommu_type1]
vfio_pin_map_dma+0xe6/0x2c0 [vfio_iommu_type1]
vfio_iommu_type1_ioctl+0xd32/0xea0 [vfio_iommu_type1]

I also confirmed that cherry picking "vfio/pci: Align huge faults to order"
does not affect our usage of this path (manual mmap alignment is still
required).

Thanks,
Alex Mastro
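
For readers following the thread, here is a minimal sketch of what "manual mmap alignment" in userspace can look like. It is an illustration only, not the code used in the setup described above. The helper name mmap_aligned is hypothetical. The sketch assumes that len is a multiple of the page size, that align is a power of two (for example 2 MiB or 1 GiB), and that the file offset being mapped is itself aligned to align within the device region, so that virtual and physical addresses line up for huge faults. The idea is to reserve an oversized PROT_NONE range, pick the aligned address inside it, map the real object there with MAP_FIXED, and trim the unused head and tail.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the thread): map `len` bytes of `fd` at
 * `offset` so that the returned address is a multiple of `align`.
 * Assumes `len` is a multiple of the page size and `align` is a power of two.
 * Returns MAP_FAILED on error, like mmap().
 */
static void *mmap_aligned(size_t len, size_t align, int prot, int flags,
                          int fd, off_t offset)
{
    /* Precondition of this sketch: align must be a nonzero power of two. */
    assert(align != 0 && (align & (align - 1)) == 0);

    if (len == 0 || len > SIZE_MAX - align) {
        errno = EINVAL;
        return MAP_FAILED;
    }

    /* Reserve len + align bytes so an aligned start must exist inside. */
    size_t span = len + align;
    uint8_t *base = mmap(NULL, span, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return MAP_FAILED;

    /* Round the reservation's start up to the next multiple of align. */
    uintptr_t start = ((uintptr_t)base + align - 1) & ~((uintptr_t)align - 1);
    uint8_t *aligned = (uint8_t *)start;

    /* Replace part of the reservation with the real mapping. */
    void *p = mmap(aligned, len, prot, flags | MAP_FIXED, fd, offset);
    if (p == MAP_FAILED) {
        int saved = errno;
        munmap(base, span);
        errno = saved;
        return MAP_FAILED;
    }

    /* Trim the unused head and tail of the reservation. */
    if (aligned > base)
        munmap(base, (size_t)(aligned - base));
    uint8_t *end = aligned + len;
    uint8_t *span_end = base + span;
    if (span_end > end)
        munmap(end, (size_t)(span_end - end));

    return p;
}
```

With a VFIO device fd, the caller would pass the region offset and size reported for the BAR and MAP_SHARED as the flags. Whether huge faults are then actually taken still depends on the kernel in use.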