Re: [PATCH 1/2] nfsd: provide locking for v4_end_grace

"NeilBrown" <neil@xxxxxxxxxx> · Sun, 06 Jul 2025 15:56:31 +1000

On Sat, 05 Jul 2025, Jeff Layton wrote:
> On Fri, 2025-07-04 at 17:20 +1000, NeilBrown wrote:
> > Writing to v4_end_grace can race with server shutdown and result in
> > memory being accessed after it was freed - reclaim_str_hashtbl in
> > particular.
> > 
> > We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is
> > held while client_tracking_op->init() is called and that can wait for
> > an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a
> > deadlock.
> > 
> > nfsd4_end_grace() is also called by the landromat work queue and this
> > doesn't require locking as server shutdown will stop the work and wait
> > for it before freeing anything that nfsd4_end_grace() might access.
> > 
> > However, we must be sure that writing to v4_end_grace doesn't restart
> > the work item after shutdown has already waited for it.  For this we add
> > a new flag protected with a spin_lock, and nn->client_lock is suitable.
> > It is set only while it is safe to make client tracking calls, and
> > v4_end_grace only schedules work while the flag is set and with the
> > spin_lock held.
> > 
> > So this patch adds an nfsd_net field "client_tracking_active" which is
> > set as described.  Another field "grace_end_forced", is set when
> > v4_end_grace is written.  After this is set, and providing
> > client_tracking_active is set, the laundromat is scheduled.
> > This "grace_end_forced" field bypasses other checks for whether the
> > grace period has finished.
> > 
> > This resolves a race which can result in use-after-free.
> > 
> > Reported-and-tested-by: Li Lingfeng <lilingfeng3@xxxxxxxxxx>
> > Closes: https://lore.kernel.org/linux-nfs/20250513074305.3362209-1-lilingfeng3@xxxxxxxxxx
> > Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: NeilBrown <neil@xxxxxxxxxx>
> > ---
> >  fs/nfsd/netns.h     |  2 ++
> >  fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++++++++++--
> >  fs/nfsd/nfsctl.c    |  6 +++---
> >  fs/nfsd/state.h     |  2 +-
> >  4 files changed, 49 insertions(+), 6 deletions(-)
> > 
> > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > index 3e2d0fde80a7..fe8338735e7c 100644
> > --- a/fs/nfsd/netns.h
> > +++ b/fs/nfsd/netns.h
> > @@ -66,6 +66,8 @@ struct nfsd_net {
> >  
> >  	struct lock_manager nfsd4_manager;
> >  	bool grace_ended;
> > +	bool grace_end_forced;
> > +	bool client_tracking_active;
> 
> ISTM that the client_tracking_active bool is set and cleared similarly
> to the nn->client_tracking_ops pointer itself. It _might_ be possible
> to eliminate this bool and just use that pointer instead, though they
> are not exactly cleared at the same time.

Yes it might.
We currently set nn->client_tracking_ops before calling
nn->client_tracking_ops->init(),
and only clear it *after* calling nn->client_tracking_ops->exit().
If the ->init and ->exit functions never need nn->client_tracking_ops
(and I don't think they do) then we could set it after a successful
init, and clear it before calling ->exit.  Then we could use it as the
flag. We would need the exit path to be
  spinlock()
  ops = xchg(nn->client_tracking_ops, NULL);
  spinunlock()
  cancel_delayed_work_sync()
  ops->exit()

which is a little less abstraction than I would like, but should be ok.

This might be a useful simplification but I don't think we should try it
before submitting the fix.

Thanks for the suggestion and for the review.

NeilBrown