FYI, I decided to try to get some numbers with Mike's RWF_DONTCACHE patches for nfsd [1]. Those add a module parameter that makes all reads and writes use RWF_DONTCACHE.

I had one host running knfsd with an XFS export, and a second acting as the NFS client. Both machines have tons of memory, so pagecache utilization is irrelevant for this test.

I tested sequential writes using the fio-seq-write.fio example job, both with and without the module parameter enabled. These numbers are from one run each, but they were pretty stable over several runs:

# fio /usr/share/doc/fio/examples/fio-seq-write.fio

wsize=1M:

Normal:
  WRITE: bw=1034MiB/s (1084MB/s), 1034MiB/s-1034MiB/s (1084MB/s-1084MB/s), io=910GiB (977GB), run=901326-901326msec

DONTCACHE:
  WRITE: bw=649MiB/s (681MB/s), 649MiB/s-649MiB/s (681MB/s-681MB/s), io=571GiB (613GB), run=900001-900001msec

With a 1M wsize, DONTCACHE was about 30% slower than recent (v6.14-ish) knfsd. Memory consumption was down, but these boxes have oodles of memory, so I didn't notice much change there.

Chris suggested that the write sizes were too small in this test, so I grabbed Chuck's patches to increase the max RPC payload size to 4M [2], and patched the client to allow a wsize that big:

wsize=4M:

Normal:
  WRITE: bw=1053MiB/s (1104MB/s), 1053MiB/s-1053MiB/s (1104MB/s-1104MB/s), io=930GiB (999GB), run=904526-904526msec

DONTCACHE:
  WRITE: bw=1191MiB/s (1249MB/s), 1191MiB/s-1191MiB/s (1249MB/s-1249MB/s), io=1050GiB (1127GB), run=902781-902781msec

Not much change with normal buffered I/O here, but DONTCACHE is faster with a 4M wsize.

My suspicion (unconfirmed) is that the dropbehind flag causes partially-written large folios in the pagecache to be written back too early, which then slows down later writes to the same folios. I wonder if we need a heuristic that makes generic_write_sync() kick off writeback immediately only when the whole folio is dirty, so we have more time to gather writes before starting writeback? This might also be a good reason to think about a larger rsize/wsize limit in the client.

I'd also like to test reads with this flag, but I'm currently getting back that EOPNOTSUPP error when I try to test them.

[1]: https://lore.kernel.org/linux-nfs/20250220171205.12092-1-snitzer@xxxxxxxxxx/
[2]: https://lore.kernel.org/linux-nfs/20250428193702.5186-15-cel@xxxxxxxxxx/
-- 
Jeff Layton <jlayton@xxxxxxxxxx>
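
P.S. For anyone who wants to poke at the flag outside of nfsd: below is a minimal userspace sketch of an RWF_DONTCACHE (uncached/dropbehind buffered) write via pwritev2(). This is just an illustration, not part of Mike's patches; it assumes a v6.14+ kernel and filesystem support, and the fallback #define for the flag value is an assumption to be checked against include/uapi/linux/fs.h.

/* sketch: single 1M uncached buffered write with RWF_DONTCACHE */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080	/* assumed value; verify against linux/fs.h */
#endif

int main(int argc, char **argv)
{
	static char buf[1024 * 1024];	/* 1M write, roughly matching wsize=1M above */
	struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
	ssize_t ret;
	int fd;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}

	fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(buf, 'x', sizeof(buf));

	/*
	 * Buffered write that asks the kernel to drop the pagecache behind it
	 * once the data has been written back. Kernels or filesystems without
	 * uncached buffered I/O support are expected to fail here (e.g. with
	 * the EOPNOTSUPP mentioned above).
	 */
	ret = pwritev2(fd, &iov, 1, 0, RWF_DONTCACHE);
	if (ret < 0)
		perror("pwritev2(RWF_DONTCACHE)");
	else
		printf("wrote %zd bytes uncached\n", ret);

	close(fd);
	return ret < 0 ? 1 : 0;
}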