Re: [PATCH RFC 0/2] nfsd: issue POSIX_FADV_DONTNEED after READ/WRITE/COMMIT

Jeff Layton <jlayton@xxxxxxxxxx> · Sat, 05 Jul 2025 07:32:58 -0400

On Fri, 2025-07-04 at 09:16 +1000, NeilBrown wrote:
> On Fri, 04 Jul 2025, Jeff Layton wrote:
> > Chuck and I were discussing RWF_DONTCACHE and he suggested that this
> > might be an alternate approach. My main gripe with DONTCACHE was that it
> > kicks off writeback after every WRITE operation. With NFS, we generally
> > get a COMMIT operation at some point. Allowing us to batch up writes
> > until that point has traditionally been considered better for
> > performance.
> 
> I wonder if that traditional consideration is justified, given your
> subsequent results.  The addition of COMMIT in v3 allowed us to both:
>  - delay kicking off writes
>  - not wait for writes to complete
> 
> I think the second was always primary.  Maybe we didn't consider the
> value of the first enough.
> Obviously the client caches writes and delays the start of writeback.
> Adding another delay on the server side does not seem to have a clear
> justification.  Maybe we *should* kick-off writeback immediately.  There
> would still be opportunity for subsequent WRITE requests to be merged
> into the writeback queue.
> 

That is the fundamental question: should we delay writeback or not? It
seems like delaying it is probably best, even in the modern era with
SSDs, but we do need more numbers here (ideally across a range of
workloads).

> Ideally DONTCACHE should only affect cache usage and the latency of
> subsequent READs.  It shouldn't affect WRITE behaviour.
> 

It definitely does affect WRITE behaviour today. The ideal thing IMO would be to
just add the dropbehind flag to the folios on writes but not call
filemap_fdatawrite_range_kick() on every write operation.

After a COMMIT the pages should be clean and the vfs_fadvise call
should just drop them from the cache, so this approach shouldn't
materially change how writeback behaves.
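
To make the ordering concrete, here is a rough userspace analogy of the
sequence described above. It is only a sketch and not the nfsd code: the
helper name write_commit_drop() is hypothetical, write() stands in for a
WRITE, fsync() for a COMMIT, and posix_fadvise() for the vfs_fadvise
call. The point is that the DONTNEED hint is issued only once the pages
are clean, so it never has to trigger writeback itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (userspace illustration only, not nfsd code):
 * buffer the data, flush it, then ask the kernel to drop the now-clean
 * pages. Returns 0 on success or a negative errno value on failure.
 */
int write_commit_drop(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -errno;

    const char *p = buf;
    size_t left = len;
    int ret = 0;

    /* "WRITE": data lands in the page cache as dirty pages. */
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            ret = -errno;
            goto out;
        }
        p += n;
        left -= (size_t)n;
    }

    /* "COMMIT": writeback happens here, batched, not per write. */
    if (fsync(fd) < 0) {
        ret = -errno;
        goto out;
    }

    /*
     * Pages are clean now, so DONTNEED can simply drop them.
     * offset 0 with len 0 means "through the end of the file".
     * posix_fadvise() returns the error number directly.
     */
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (err)
        ret = -err;

out:
    close(fd);
    return ret;
}
```

Calling posix_fadvise(POSIX_FADV_DONTNEED) before the fsync() would be
the analogue of the case we want to avoid: the pages would still be
dirty, and whether they get dropped would then depend on writeback
having already been kicked off.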
-- 
Jeff Layton <jlayton@xxxxxxxxxx>