Re: performance r nfsd with RWF_DONTCACHE and larger wsizes

Dave Chinner <david@xxxxxxxxxxxxx> · Wed, 7 May 2025 08:31:17 +1000

On Tue, May 06, 2025 at 01:40:35PM -0400, Jeff Layton wrote:
> FYI I decided to try and get some numbers with Mike's RWF_DONTCACHE
> patches for nfsd [1]. Those add a module param that make all reads and
> writes use RWF_DONTCACHE.
> 
> I had one host that was running knfsd with an XFS export, and a second
> that was acting as NFS client. Both machines have tons of memory, so
> pagecache utilization is irrelevant for this test.

Does RWF_DONTCACHE result in server side STABLE write requests from
the NFS client, or are they still unstable and require a post-write
completion COMMIT operation from the client to trigger server side
writeback before the client can discard the page cache?

> I tested sequential writes using the fio-seq_write.fio test, both with
> and without the module param enabled.
> 
> These numbers are from one run each, but they were pretty stable over
> several runs:
> 
> # fio /usr/share/doc/fio/examples/fio-seq-write.fio

$ cat /usr/share/doc/fio/examples/fio-seq-write.fio
cat: /usr/share/doc/fio/examples/fio-seq-write.fio: No such file or directory
$

What are the fio control parameters of the IO you are doing? (e.g.
is this single threaded IO, does it use the psync, libaio or iouring
engine, etc)

> wsize=1M:
> 
> Normal:      WRITE: bw=1034MiB/s (1084MB/s), 1034MiB/s-1034MiB/s (1084MB/s-1084MB/s), io=910GiB (977GB), run=901326-901326msec
> DONTCACHE:   WRITE: bw=649MiB/s (681MB/s), 649MiB/s-649MiB/s (681MB/s-681MB/s), io=571GiB (613GB), run=900001-900001msec
> 
> DONTCACHE with a 1M wsize vs. recent (v6.14-ish) knfsd was about 30%
> slower. Memory consumption was down, but these boxes have oodles of
> memory, so I didn't notice much change there.

So what is the IO pattern that the NFSD is sending to the underlying
XFS filesystem?

Is it sending 1M RWF_DONTCACHE buffered IOs to XFS as well (i.e. one
buffered write IO per NFS client write request), or is DONTCACHE
only being used on the NFS client side?

> I wonder if we need some heuristic that makes generic_write_sync() only
> kick off writeback immediately if the whole folio is dirty so we have
> more time to gather writes before kicking off writeback?

You're doing aligned 1MB IOs - there should be no partially dirty
large folios in either the client or the server page caches.

That said, this is part of the reason I asked about both whether the
client side write is STABLE and  whether RWF_DONTCACHE on
the server side. i.e. using either of those will trigger writeback
on the serer side immediately; in the case of the former it will
also complete before returning to the client and not require a
subsequent COMMIT RPC to wait for server side IO completion...

-Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx