On Fri, 2025-07-04 at 09:16 +1000, NeilBrown wrote:
> On Fri, 04 Jul 2025, Jeff Layton wrote:
> > Chuck and I were discussing RWF_DONTCACHE and he suggested that this
> > might be an alternate approach. My main gripe with DONTCACHE was that it
> > kicks off writeback after every WRITE operation. With NFS, we generally
> > get a COMMIT operation at some point. Allowing us to batch up writes
> > until that point has traditionally been considered better for
> > performance.
>
> I wonder if that traditional consideration is justified, given your
> subsequent results. The addition of COMMIT in v3 allowed us to both:
> - delay kicking off writes
> - not wait for writes to complete
>
> I think the second was always primary. Maybe we didn't consider the
> value of the first enough.
> Obviously the client caches writes and delays the start of writeback.
> Adding another delay on the server side does not seem to have a clear
> justification. Maybe we *should* kick-off writeback immediately. There
> would still be opportunity for subsequent WRITE requests to be merged
> into the writeback queue.
>

That is the fundamental question: should we delay writeback or not? It
seems like delaying it is probably best, even in the modern era with
SSDs, but we do need more numbers here (ideally across a range of
workloads).

> Ideally DONTCACHE should only affect cache usage and the latency of
> subsequent READs. It shouldn't affect WRITE behaviour.
>

It definitely does affect it today. The ideal thing IMO would be to
just add the dropbehind flag to the folios on writes but not call
filemap_fdatawrite_range_kick() on every write operation.

After a COMMIT the pages should be clean and the vfs_fadvise call
should just drop them from the cache, so this approach shouldn't
materially change how writeback behaves.

--
Jeff Layton <jlayton@xxxxxxxxxx>
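
For illustration only, here is a rough userspace analogue of the
sequence described above: plain buffered writes with no per-write
writeback kick, one flush standing in for COMMIT, then a hint to drop
the now-clean pages. This is not nfsd code. The helper name
write_batch_then_drop() is hypothetical, and fsync() and
posix_fadvise(POSIX_FADV_DONTNEED) are assumed stand-ins for the
server's commit path and its vfs_fadvise call.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write nchunks chunks of `chunk` bytes to `path`
 * through the page cache, flush once, then ask the kernel to drop the
 * cached pages. Returns 0 on success, -1 on any failure.
 */
static int write_batch_then_drop(const char *path, int nchunks, size_t chunk)
{
	char buf[4096];
	int fd, i, ret = -1;

	if (chunk > sizeof(buf))
		return -1;
	memset(buf, 'x', chunk);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* "WRITE" phase: buffered writes only, no writeback is requested. */
	for (i = 0; i < nchunks; i++) {
		ssize_t n = pwrite(fd, buf, chunk, (off_t)i * (off_t)chunk);

		if (n != (ssize_t)chunk)
			goto out;
	}

	/* "COMMIT" stand-in: a single flush for the whole batch. */
	if (fsync(fd) != 0)
		goto out;

	/*
	 * The pages are clean now, so this hint can simply drop them.
	 * posix_fadvise() returns an error number rather than -1.
	 * offset 0 with len 0 means "the whole file".
	 */
	if (posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) != 0)
		goto out;

	ret = 0;
out:
	close(fd);
	return ret;
}
```

Calling write_batch_then_drop("/tmp/example.bin", 4, 1024) (the path is
made up) issues four buffered writes, one flush and one drop hint. The
drop is advisory, so this only sketches the ordering and says nothing
about which policy performs better.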