[ responding to both Neil and Dave ... ]

On 2/12/25 5:16 PM, NeilBrown wrote:
> On Tue, 11 Feb 2025, Dave Chinner wrote:
>> On Fri, Feb 07, 2025 at 04:15:11PM +1100, NeilBrown wrote:
>>> The race that this code tries to protect against is not interesting.
>>> The code is problematic as we access the "nf" after we have given our
>>> reference to the lru system. While that takes 2+ seconds to free
>>> things, it is still poor form.
>>>
>>> The only interesting race I can find would be with
>>> nfsd_file_close_inode_sync(). This is the only place that really
>>> doesn't want the file to stay on the LRU when unhashed (which is the
>>> direct consequence of the race).
>>>
>>> However, for the race to happen, some other thread must own a
>>> reference to a file and be putting it while nfsd_file_close_inode_sync()
>>> is trying to close all files for an inode. If this is possible, that
>>> other thread could simply call nfsd_file_put() a little bit later and
>>> the result would be the same: not all files are closed when
>>> nfsd_file_close_inode_sync() completes.
>>>
>>> If this was really a problem, we would need to wait in
>>> close_inode_sync for the other references to be dropped. We probably
>>> don't want to do that.
>>>
>>> So it is best to simply remove this code.
>>>
>>> Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
>>> Signed-off-by: NeilBrown <neilb@xxxxxxx>
>>> ---
>>>  fs/nfsd/filecache.c | 16 +++-------------
>>>  1 file changed, 3 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>> index a1cdba42c4fa..b13255bcbb96 100644
>>> --- a/fs/nfsd/filecache.c
>>> +++ b/fs/nfsd/filecache.c
>>> @@ -371,20 +371,10 @@ nfsd_file_put(struct nfsd_file *nf)
>>>  
>>>  	/* Try to add it to the LRU. If that fails, decrement. */
>>>  	if (nfsd_file_lru_add(nf)) {
>>> -		/* If it's still hashed, we're done */
>>> -		if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
>>> -			nfsd_file_schedule_laundrette();
>>> -			return;
>>> -		}
>>> -
>>> -		/*
>>> -		 * We're racing with unhashing, so try to remove it from
>>> -		 * the LRU. If removal fails, then someone else already
>>> -		 * has our reference.
>>> -		 */
>>> -		if (!nfsd_file_lru_remove(nf))
>>> -			return;
>>> +		nfsd_file_schedule_laundrette();
>>> +		return;
>>
>> Why do we need to start the background gc on demand?

This call is an original feature of the filecache -- I'm guessing it was
done to ensure that a final put closes the file quickly. But over time,
the filecache has grown explicit code that handles that, so I think it
is likely this call is no longer needed.

>> When the nfsd subsystem is under load it is going to be calling
>> nfsd_file_schedule_laundrette() all the time. However, gc is almost
>> always going to be queued/running already.
>>
>> i.e. the above code results in adding the overhead of an atomic
>> NFSD_FILE_CACHE_UP bit check and then a call to queue_delayed_work()
>> on an already queued item. This will disable interrupts, make an
>> atomic bit check that sees the work is queued, re-enable interrupts,
>> and return.
>
> I don't think we really need the NFSD_FILE_CACHE_UP test - if we have
> a file to put, then the cache must be up.

I can change 1/6 to remove the call to nfsd_file_schedule_laundrette().
Then these questions become moot.

> And we could use delayed_work_pending() to avoid the irq disable.
> That is still one atomic bit test though.
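If the call did stay, the check Neil describes would look roughly like
this. Untested sketch only, written from memory of the current helper,
so the workqueue and delay names may not match the tree exactly:

/*
 * Sketch: bail out when the laundrette is already queued, so the common
 * busy-server case skips the irq-disable inside queue_delayed_work().
 * The NFSD_FILE_CACHE_UP test is dropped per Neil's comment above.
 */
static void
nfsd_file_schedule_laundrette(void)
{
	if (delayed_work_pending(&nfsd_filecache_laundrette))
		return;
	queue_delayed_work(system_unbound_wq, &nfsd_filecache_laundrette,
			   NFSD_LAUNDRETTE_DELAY);
}

That is still one bit test in the hot path, as Neil says, but the
already-queued case no longer disables and re-enables interrupts.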
>> IMO, there is no need to do this unnecessary work on every object
>> that is added to the LRU. Changing the gc worker to always run
>> every 2s and check if it has work to do like so:
>>
>> static void
>> nfsd_file_gc_worker(struct work_struct *work)
>> {
>> -	nfsd_file_gc();
>> -	if (list_lru_count(&nfsd_file_lru))
>> -		nfsd_file_schedule_laundrette();
>> +	if (list_lru_count(&nfsd_file_lru))
>> +		nfsd_file_gc();
>> +	nfsd_file_schedule_laundrette();
>> }
>>
>> means that nfsd_file_gc() will be run the same way and have the same
>> behaviour as the current code. When the system is idle, it does a
>> list_lru_count() check every 2 seconds and goes back to sleep.
>> That's going to be pretty much unnoticeable on most machines that run
>> NFS servers.

I can add a patch to this series that makes this swap.

> When serving a v4-only load we don't need the timer at all... Maybe
> that doesn't matter.
>
> I would rather use a timer instead of a delayed work as the task
> doesn't require sleeping, and we have a pool of nfsd threads to do any
> work that might be needed. So a timer that checks if work is needed
> then sets a flag and wakes an nfsd would be even lower impact than a
> delayed work which needs to wake a wq thread every time even if there
> is nothing to do.

Moving the laundrette work to nfsd threads seems sensible, but let's
leave that for another patch series. To capture the idea for the
archive, a rough sketch is below.
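Purely illustrative and untested: a periodic timer that only flags the
work and pokes the nfsd thread pool, with an nfsd thread then running
nfsd_file_gc() from process context. nfsd_file_gc_timer,
nfsd_file_gc_pending and nfsd_file_wake_svc() are invented names for
the sketch; the real wake-up would presumably go through svc_wake_up()
or similar.

/* Invented names, for illustration only. */
static struct timer_list nfsd_file_gc_timer;
static atomic_t nfsd_file_gc_pending;

static void nfsd_file_gc_timer_fn(struct timer_list *t)
{
	/* Nothing to age out: stay quiet until a later put re-arms us. */
	if (!list_lru_count(&nfsd_file_lru))
		return;

	/* Flag the work; wake one nfsd thread only on the 0 -> 1 edge. */
	if (!atomic_xchg(&nfsd_file_gc_pending, 1))
		nfsd_file_wake_svc();	/* stand-in for svc_wake_up() */

	/* Keep ticking while there is still state on the LRU. */
	mod_timer(&nfsd_file_gc_timer, jiffies + 2 * HZ);
}

An nfsd thread would then check the flag at a convenient point in its
main loop:

	if (atomic_xchg(&nfsd_file_gc_pending, 0))
		nfsd_file_gc();

-- 
Chuck Lever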