Re: [PATCH v4] vfio/type1: optimize vfio_pin_pages_remote() for large folio

On Wed, 21 May 2025 13:17:11 -0600, alex.williamson@xxxxxxxxxx wrote:

>> From: Li Zhe <lizhe.67@xxxxxxxxxxxxx>
>> 
>> When vfio_pin_pages_remote() is called with a range of addresses that
>> includes large folios, the function currently performs individual
>> statistics counting operations for each page. This can lead to
>> significant performance overhead, especially when dealing with large
>> ranges of pages.
>> 
>> This patch optimizes this process by batching the statistics counting
>> operations.
>> 
>> The performance test results for completing the 8G VFIO IOMMU DMA mapping,
>> obtained through trace-cmd, are as follows. In this case, the 8G virtual
>> address space has been mapped to physical memory using hugetlbfs with
>> pagesize=2M.
>> 
>> Before this patch:
>> funcgraph_entry:      # 33813.703 us |  vfio_pin_map_dma();
>> 
>> After this patch:
>> funcgraph_entry:      # 16071.378 us |  vfio_pin_map_dma();
>> 
>> Signed-off-by: Li Zhe <lizhe.67@xxxxxxxxxxxxx>
>> Co-developed-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
>> Signed-off-by: Alex Williamson <alex.williamson@xxxxxxxxxx>
>> ---
>
>Given the discussion on v3, this is currently a Nak.  Follow-up in that
>thread if there are further ideas on how to salvage this.  Thanks,

How about the solution David mentioned: checking whether the pages or
PFNs are actually consecutive?

I have made a preliminary attempt, and performance testing shows the
mapping now takes approximately 18,000 microseconds. Compared with the
previous 33,000 microseconds, this is still a significant improvement.

The modification is quite straightforward. The diff below shows the
changes I made on top of this patch.

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index bd46ed9361fe..1cc1f76d4020 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -627,6 +627,24 @@ static long vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
        return ret;
 }
 
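+/*
+ * Return the number of pages in the batch, starting at batch->offset,
+ * whose PFNs are physically consecutive.  The first page always
+ * counts, so the return value is at least 1.
+ */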
+static inline long continuous_page_num(struct vfio_batch *batch, long npage)
+{
+       long i;
+       unsigned long next_pfn = page_to_pfn(batch->pages[batch->offset]) + 1;
+
+       for (i = 1; i < npage; ++i) {
+               if (page_to_pfn(batch->pages[batch->offset + i]) != next_pfn)
+                       break;
+               next_pfn++;
+       }
+       return i;
+}
+
 /*
  * Attempt to pin pages.  We really don't want to track all the pfns and
  * the iommu can only map chunks of consecutive pfns anyway, so get the
@@ -708,8 +721,12 @@ static long vfio_pin_pages_remote(struct vfio_dma *dma, unsigned long vaddr,
                         */
                        nr_pages = min_t(long, batch->size, folio_nr_pages(folio) -
                                                folio_page_idx(folio, batch->pages[batch->offset]));
-                       if (nr_pages > 1 && vfio_find_vpfn_range(dma, iova, nr_pages))
-                               nr_pages = 1;
+                       if (nr_pages > 1) {
+                               if (vfio_find_vpfn_range(dma, iova, nr_pages))
+                                       nr_pages = 1;
+                               else
+                                       nr_pages = continuous_page_num(batch, nr_pages);
+                       }
 
                        /*
                         * Reserved pages aren't counted against the user,
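
For reference, here is a minimal user-space sketch of the same
run-detection idea, operating on a plain array of PFNs. The helper
name and the sample PFN values are illustrative only and are not part
of the patch:

#include <stdio.h>

/*
 * Count how many entries of pfns[], starting at index 0, form a
 * physically consecutive run.  This mirrors the loop in
 * continuous_page_num() above.
 */
static long count_consecutive_pfns(const unsigned long *pfns, long npage)
{
	long i;
	unsigned long next_pfn = pfns[0] + 1;

	for (i = 1; i < npage; ++i) {
		if (pfns[i] != next_pfn)
			break;
		next_pfn++;
	}
	return i;
}

int main(void)
{
	/* 0x1000..0x1003 are consecutive; 0x2000 breaks the run. */
	unsigned long pfns[] = { 0x1000, 0x1001, 0x1002, 0x1003, 0x2000 };

	printf("%ld\n", count_consecutive_pfns(pfns, 5)); /* prints 4 */
	return 0;
}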

Thanks,
Zhe