On Fri, 04 Jul 2025, Chuck Lever wrote: > On 7/3/25 7:16 PM, NeilBrown wrote: > > On Fri, 04 Jul 2025, Jeff Layton wrote: > >> Chuck and I were discussing RWF_DONTCACHE and he suggested that this > >> might be an alternate approach. My main gripe with DONTCACHE was that it > >> kicks off writeback after every WRITE operation. With NFS, we generally > >> get a COMMIT operation at some point. Allowing us to batch up writes > >> until that point has traditionally been considered better for > >> performance. > > > > I wonder if that traditional consideration is justified, give your > > subsequent results. The addition of COMMIT in v3 allowed us to both: > > - delay kicking off writes > > - not wait for writes to complete > > > > I think the second was always primary. Maybe we didn't consider the > > value of the first enough. > > Obviously the client caches writes and delays the start of writeback. > > Adding another delay on the serve side does not seem to have a clear > > justification. Maybe we *should* kick-off writeback immediately. There > > would still be opportunity for subsequent WRITE requests to be merged > > into the writeback queue. > > Dave Chinner had the same thought a while back. So I've experimented > with starting writes as part of nfsd_write(). Kicking off writes, > even without waiting, is actually pretty costly, and it resulted in > worse performance. Was this with filemap_fdatawrite_range_kick() or something else? > > Now that .pc_release is called /after/ the WRITE response has been sent, > though, that might be a place where kicking off writeback could be done > without charging that latency to the client. Certainly after is better. I was imagining kicking the writeback threads, but they seems to only do whole filesystems. Maybe that is OK? I doubt we want more than one thread kicking off write for any given inode at a time. Maybe it would be best to add a "kicking_write" flag in the filecache and if it is already set, then just add the range to some range of pending-writes. Maybe. I guess I should explore how to test :-) NeilBrown