Re: nfs client and io_uring zero copy receive

> The only way you can avoid memory copies here is to use RDMA to allow
> the server to write its replies directly into the correct client read
> buffers.

I remounted with RDMA:

[root@23-127-77-6 ~]# mount -t nfs -o proto=rdma,nconnect=16,rsize=4194304,wsize=4194304 192.168.0.7:/mnt /mnt
[root@23-127-77-6 ~]# mount -v|grep -i rdma
192.168.0.7:/mnt on /mnt type nfs4
(rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,fatal_neterrors=none,proto=rdma,nconnect=16,port=20049,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.8,local_lock=none,addr=192.168.0.7)
[root@23-127-77-6 ~]#

and repeated the sequential read.

According to perf top, memcpy is gone:

Samples: 64K of event 'cycles:P', 4000 Hz, Event count (approx.):
22510217633 lost: 0/0 drop: 0/0
Overhead  Shared Object                      Symbol
  13,12%  [nfs]                              [k] nfs_generic_pg_test
  11,32%  [nfs]                              [k] nfs_page_group_lock
  10,42%  [nfs]                              [k] nfs_clear_request
   5,41%  [kernel]                           [k] gup_fast_pte_range
   4,11%  [nfs]                              [k] nfs_page_group_sync_on_bit
   3,36%  [nfs]                              [k] nfs_page_create
   3,13%  [nfs]                              [k] __nfs_pageio_add_request
   2,10%  [nfs]                              [k] __nfs_find_lock_context

but it didn't improve read bandwidth at all; if anything, it was
slightly worse than with proto=tcp.

Anton

On Tue, 22 Jul 2025 at 21:43, Trond Myklebust <trondmy@xxxxxxxxxx> wrote:
>
> On Tue, 2025-07-22 at 21:10 +0300, Anton Gavriliuk wrote:
> > Hi
> >
> > I am trying to exceed 20 GB/s doing a sequential read from a single
> > file on the NFS client.
> >
> > perf top shows excessive memcpy usage:
> >
> > Samples: 237K of event 'cycles:P', 4000 Hz, Event count (approx.):
> > 120872739112 lost: 0/0 drop: 0/0
> > Overhead  Shared Object                      Symbol
> >   20,54%  [kernel]                           [k] memcpy
> >    6,52%  [nfs]                              [k] nfs_generic_pg_test
> >    5,12%  [nfs]                              [k] nfs_page_group_lock
> >    4,92%  [kernel]                           [k] _copy_to_iter
> >    4,79%  [kernel]                           [k] gro_list_prepare
> >    2,77%  [nfs]                              [k] nfs_clear_request
> >    2,10%  [nfs]                              [k] __nfs_pageio_add_request
> >    2,07%  [kernel]                           [k] check_heap_object
> >    2,00%  [kernel]                           [k] __slab_free
> >
> > Can the NFS client be adapted to use zero copy, for example by
> > using io_uring zero-copy rx?
> >
>
> The client has no idea in which order the server will return replies to
> the RPC calls it sends. So no, it can't queue up those reply buffers in
> advance.
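
To make the ordering problem concrete: SunRPC matches each reply to its
call by the XID at the start of the reply, so incoming data sits in
network buffers until the XID identifies which request owns it, and only
then can it be copied into that request's pages (the _copy_to_iter hit
in the profile above). A toy C sketch of the demultiplexing idea, with
every name hypothetical rather than taken from the kernel:

/* Toy model of XID-based reply demultiplexing. Not kernel code;
 * struct pending_call and find_pending_call() are made up. */
#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <arpa/inet.h>

struct pending_call {
	uint32_t xid;        /* transaction id sent with the call */
	void    *pages;      /* where the reader wants the data   */
	size_t   len;
};

/* Hypothetical: look up an outstanding call by its XID. */
struct pending_call *find_pending_call(uint32_t xid);

void demux_reply(const void *rx, size_t rx_len)
{
	uint32_t xid;

	if (rx_len < sizeof(xid))
		return;
	/* The XID is the first 4 bytes of an RPC message, in
	 * network byte order; we only learn the owner here. */
	memcpy(&xid, rx, sizeof(xid));

	struct pending_call *call = find_pending_call(ntohl(xid));
	if (!call)
		return;		/* no matching call: drop */

	/* The copy that zero-copy rx can't eliminate: this buffer
	 * could not have been posted for this call in advance. */
	size_t n = rx_len - sizeof(xid);
	memcpy(call->pages, (const char *)rx + sizeof(xid),
	       n < call->len ? n : call->len);
}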
>
> The only way you can avoid memory copies here is to use RDMA to allow
> the server to write its replies directly into the correct client read
> buffers.
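
At the verbs level, the direct placement described above amounts to the
server posting an RDMA WRITE into a buffer the client has registered and
advertised. A minimal sketch of the server side, assuming an
already-connected queue pair and a remote address/rkey received from the
client; this is illustrative, not the kernel's rpcrdma code:

/* Illustrative server-side RDMA WRITE; not the rpcrdma implementation.
 * Assumes qp is already connected and the client sent us the address
 * and rkey of its registered read buffer. */
#include <infiniband/verbs.h>
#include <stdint.h>

int write_reply_direct(struct ibv_qp *qp, struct ibv_mr *local_mr,
		       void *reply, uint32_t reply_len,
		       uint64_t client_addr, uint32_t client_rkey)
{
	struct ibv_sge sge = {
		.addr   = (uintptr_t)reply,
		.length = reply_len,
		.lkey   = local_mr->lkey,
	};
	struct ibv_send_wr wr = {
		.sg_list    = &sge,
		.num_sge    = 1,
		.opcode     = IBV_WR_RDMA_WRITE,
		.send_flags = IBV_SEND_SIGNALED,
	};
	struct ibv_send_wr *bad_wr;

	/* The NIC places the payload directly into the client's
	 * registered buffer; no CPU memcpy on either end. */
	wr.wr.rdma.remote_addr = client_addr;
	wr.wr.rdma.rkey        = client_rkey;

	return ibv_post_send(qp, &wr, &bad_wr);
}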
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trondmy@xxxxxxxxxx, trond.myklebust@xxxxxxxxxxxxxxx
