On Sat, 05 Jul 2025, Jeff Layton wrote: > On Fri, 2025-07-04 at 17:20 +1000, NeilBrown wrote: > > Writing to v4_end_grace can race with server shutdown and result in > > memory being accessed after it was freed - reclaim_str_hashtbl in > > particular. > > > > We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is > > held while client_tracking_op->init() is called and that can wait for > > an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a > > deadlock. > > > > nfsd4_end_grace() is also called by the landromat work queue and this > > doesn't require locking as server shutdown will stop the work and wait > > for it before freeing anything that nfsd4_end_grace() might access. > > > > However, we must be sure that writing to v4_end_grace doesn't restart > > the work item after shutdown has already waited for it. For this we add > > a new flag protected with a spin_lock, and nn->client_lock is suitable. > > It is set only while it is safe to make client tracking calls, and > > v4_end_grace only schedules work while the flag is set and with the > > spin_lock held. > > > > So this patch adds an nfsd_net field "client_tracking_active" which is > > set as described. Another field "grace_end_forced", is set when > > v4_end_grace is written. After this is set, and providing > > client_tracking_active is set, the laundromat is scheduled. > > This "grace_end_forced" field bypasses other checks for whether the > > grace period has finished. > > > > This resolves a race which can result in use-after-free. > > > > Reported-and-tested-by: Li Lingfeng <lilingfeng3@xxxxxxxxxx> > > Closes: https://lore.kernel.org/linux-nfs/20250513074305.3362209-1-lilingfeng3@xxxxxxxxxx > > Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd") > > Cc: stable@xxxxxxxxxxxxxxx > > Signed-off-by: NeilBrown <neil@xxxxxxxxxx> > > --- > > fs/nfsd/netns.h | 2 ++ > > fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++++++++++-- > > fs/nfsd/nfsctl.c | 6 +++--- > > fs/nfsd/state.h | 2 +- > > 4 files changed, 49 insertions(+), 6 deletions(-) > > > > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h > > index 3e2d0fde80a7..fe8338735e7c 100644 > > --- a/fs/nfsd/netns.h > > +++ b/fs/nfsd/netns.h > > @@ -66,6 +66,8 @@ struct nfsd_net { > > > > struct lock_manager nfsd4_manager; > > bool grace_ended; > > + bool grace_end_forced; > > + bool client_tracking_active; > > ISTM that the client_tracking_active bool is set and cleared similarly > to the nn->client_tracking_ops pointer itself. It _might_ be possible > to eliminate this bool and just use that pointer instead, though they > are not exactly cleared at the same time. Yes it might. We currently set nn->client_tracking_ops before calling nn->client_tracking_ops->init(), and only clear it *after* calling nn->client_tracking_ops->exit(). If the ->init and ->exit functions never need nn->client_tracking_ops (and I don't think they do) then we could set it after a successful init, and clear it before calling ->exit. Then we could use it as the flag. We would need the exit path to be spinlock() ops = xchg(nn->client_tracking_ops, NULL); spinunlock() cancel_delayed_work_sync() ops->exit() which is a little less abstraction than I would like, but should be ok. This might be a useful simplification but I don't think we should try it before submitting the fix. Thanks for the suggestion and for the review. NeilBrown