On 21.05.25 06:25, lizhe.67@xxxxxxxxxxxxx wrote:
From: Li Zhe <lizhe.67@xxxxxxxxxxxxx>

When vfio_pin_pages_remote() is called with a range of addresses that
includes large folios, the function currently performs individual
statistics counting operations for each page. This can lead to significant
performance overhead, especially when dealing with large ranges of pages.

This patch optimizes this process by batching the statistics counting
operations.

The performance test results for completing the 8G VFIO IOMMU DMA mapping,
obtained through trace-cmd, are as follows. In this case, the 8G virtual
address space has been mapped to physical memory using hugetlbfs with
pagesize=2M.

Before this patch:
funcgraph_entry: # 33813.703 us | vfio_pin_map_dma();

After this patch:
funcgraph_entry: # 16071.378 us | vfio_pin_map_dma();

Signed-off-by: Li Zhe <lizhe.67@xxxxxxxxxxxxx>
Co-developed-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
---
Changelogs:

v3->v4:
- Use min_t() to obtain the step size, rather than min().
- Fix some issues in commit message and title.
It's usually a good idea to hold off on resubmissions until the discussion on the previous version has ended.
--
Cheers,

David / dhildenb