On Tue, 2025-07-22 at 21:10 +0300, Anton Gavriliuk wrote:
> Hi
>
> I am trying to exceed 20 GB/s doing sequential read from a single file
> on the nfs client.
>
> perf top shows excessive memcpy usage:
>
> Samples: 237K of event 'cycles:P', 4000 Hz, Event count (approx.): 120872739112 lost: 0/0 drop: 0/0
> Overhead  Shared Object  Symbol
>   20,54%  [kernel]       [k] memcpy
>    6,52%  [nfs]          [k] nfs_generic_pg_test
>    5,12%  [nfs]          [k] nfs_page_group_lock
>    4,92%  [kernel]       [k] _copy_to_iter
>    4,79%  [kernel]       [k] gro_list_prepare
>    2,77%  [nfs]          [k] nfs_clear_request
>    2,10%  [nfs]          [k] __nfs_pageio_add_request
>    2,07%  [kernel]       [k] check_heap_object
>    2,00%  [kernel]       [k] __slab_free
>
> Can nfs client be adopted to use zero copy ?, for example by using
> io_uring zero copy rx.
>

The client has no idea in which order the server will return replies
to the RPC calls it sends. So no, it can't queue up those reply buffers
in advance.

The only way you can avoid memory copies here is to use RDMA to allow
the server to write its replies directly into the correct client read
buffers.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@xxxxxxxxxx, trond.myklebust@xxxxxxxxxxxxxxx
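
To illustrate the ordering point in the reply above, here is a minimal
sketch in Python. It is a toy model, not kernel code: the function name
simulate_reads, the integer XIDs and the chunk sizes are all made up for
illustration. It assumes only that each reply carries a transaction ID
(XID) that the client must parse before it knows which READ the payload
belongs to, so the payload first lands in a generic receive buffer and is
then copied into the matching read buffer.

```python
import random


def simulate_reads(num_rpcs=4, chunk=8, seed=0):
    """Toy model: READ replies arrive in arbitrary order and are copied
    into the buffer that matches their XID. All names are hypothetical."""
    rng = random.Random(seed)
    file_data = bytes(range(num_rpcs * chunk))

    # One destination buffer per outstanding READ, keyed by a made-up XID.
    pending = {xid: bytearray(chunk) for xid in range(num_rpcs)}

    # The "server" answers every call, but in an order the client
    # cannot predict when it sends the requests.
    replies = [(xid, file_data[xid * chunk:(xid + 1) * chunk])
               for xid in pending]
    rng.shuffle(replies)

    arrival_order = []
    copies = 0
    for xid, payload in replies:
        # The payload is already in a generic receive buffer here. Only
        # after reading the XID is the destination known, so the data
        # has to be copied (the memcpy seen in the profile).
        pending[xid][:] = payload
        copies += 1
        arrival_order.append(xid)

    reassembled = b"".join(bytes(pending[x]) for x in sorted(pending))
    return arrival_order, copies, reassembled == file_data


if __name__ == "__main__":
    order, copies, ok = simulate_reads()
    print("arrival order:", order)
    print("copies:", copies, "data intact:", ok)
```

In this model every reply costs one copy regardless of arrival order.
With RDMA the server is told where each reply belongs and writes into
that buffer directly, which is why the copy step disappears there.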