Re: Report: Performance regression from ib_umem_get on zone device pages

On 4/24/2025 5:01 AM, Jason Gunthorpe wrote:
> On Wed, Apr 23, 2025 at 10:35:06PM -0700, jane.chu@xxxxxxxxxx wrote:
>
>> On 4/23/2025 4:28 PM, Jason Gunthorpe wrote:
>>>> The flow of a single test run:
>>>>    1. reserve virtual address space for (61440 * 2MB) via mmap with
>>>>       PROT_NONE and MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE
>>>>    2. mmap ((61440 * 2MB) / 12) from each of the 12 device-dax
>>>>       instances into the reserved virtual address space sequentially
>>>>       to form a contiguous VA space
>>> Is there any chance that each of these 61440 VMAs is a single
>>> 2MB folio from device-dax, or could it be?

>>> IIRC device-dax could not use folios until 6.15, so I'm assuming
>>> it is not folios even if it is a pmd mapping?
>>
>> I just ran the mr registration stress test in 6.15-rc3, much better!
>>
>> What's changed?  Is it folios for device-dax?  None of the code in
>> ib_umem_get() has changed though; it still loops through 'npages' doing
>
> I don't know, it is kind of strange that it changed. If device-dax is
> now using folios then it does change the access pattern to the struct
> page array somewhat; especially, it moves all the writes to the head
> page of the 2MB section, which maybe impacts the caching?

6.15-rc3 is orders of magnitude better.

Agreed that device-dax's use of folios is likely the hero. I've yet to
check the code and bisect, but maybe pin_user_pages_fast() now adds
folios to page_list[] instead of 4K pages? If so, with a 511/512 size
reduction in page_list[], that could drastically improve the downstream
call performance in spite of the thrashing, that is, if the thrashing
is still there.

I'll report my findings.

Thanks,
-jane


Jason




