Re: [RFC PATCH 1/1] nfsd: delay re-registering of listeners after listener removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 22 Aug 2025, Chuck Lever wrote:
> On 8/21/25 4:43 PM, Olga Kornievskaia wrote:
> > This patch tries to address the following failure:
> > nfsdctl threads 0
> > nfsdctl listener +rdma::20049
> > nfsdctl listener +tcp::2049
> > nfsdctl listener -tcp::2049
> > nfsdctl: Error: Cannot assign requested address
> > 
> > The reason for the failure is due to the fact that socket cleanup only
> > happens in __svc_rdma_free() which is a deferred work triggers when an
> > rdma transport is destroyed. To remove a listener nfsdctl is forced to
> > first remove all transports via svc_xprt_destroy_all() and then re-add
> > the ones that are left. Due to the fact that there isn't a way to
> > delete a particular entry from a list where permanent sockets are
> > stored.
> 
> The issue is specifically with llist, which does not permit the
> deletion of any entry other than the first on the list.

That is true but not really relevant.
We can remove everything, walk the resulting list removing individual
things, an requeue everything we didn't want to remove.
This is exactly what svc_clean_up_xprts() does.
To remove a signle xprt We just need to set XPT_CLOSE, call
svc_delete_xprt() then run svc_clean_up_xprts().

I think svc_clean_up_xprts() should call svc_pool_wake_idle_thread()
after lwq_enqueue_batch() in case one of the xprts that we kept became
ready while it was temporarily not on the queue.

Thanks,
NeilBrown

> 
> 
> > Going back to the deferred work done in __svc_rdma_free(), the
> > work might not get to run before nfsd_nl_listener_set_doit() creates
> > the new transports. As a result, it finds that something is still
> > listening of the rdma port and rdma_bind_addr() fails.
> > 
> > Proposed solution is to add a delay after svc_xprt_destroy_all() to
> > allow for the deferred work to run.
> > 
> > --- Is the chosen value of 1s enough to ensure socket goes away?
> > I can't guarantee that.
> 
> Adding a sleep and hoping it works is ... not a proper fix. The
> msleep() in svc_xprt_destroy_all() is part of a polling loop,
> and it is only waiting for the xprt lists to become empty. You're
> not polling here (ie, checking for completion before sleeping).
> 
> 
> > --- Alternatives that i can think of:
> > (1) to go back to listener removal approach that added removal of
> > entry to the llist api. That would not require a removal of all
> > transport causing this problem to occur. Earlier it was preferred
> > not to change llist api.
> > (2) some method of checking that all deferred work occuring in
> > svc_xprt_destroy_all() completed.
> 
> Jeff (and perhaps Lorenzo) need to go back to the original reasons why
> this was done and rework it. I think we were avoiding holding the
> nfsd mutex in here?
> 
> Complete shutdown of a transport always involve some deferred
> activity, and there's a reference count involved as well. I can't
> see how the current destroy/re-insert mechanism can be made reliable.
> 
> 
> > Fixes: d093c90892607 ("nfsd: fix management of listener transports")
> > Signed-off-by: Olga Kornievskaia <okorniev@xxxxxxxxxx>
> > ---
> >  fs/nfsd/nfsctl.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index dd3267b4c203..f9f5670abcc3 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -1998,8 +1998,10 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
> >  	 * Since we can't delete an arbitrary llist entry, destroy the
> >  	 * remaining listeners and recreate the list.
> >  	 */
> > -	if (delete)
> > +	if (delete) {
> >  		svc_xprt_destroy_all(serv, net, false);
> > +		ssleep(1);
> > +	}
> >  
> >  	/* walk list of addrs again, open any that still don't exist */
> >  	nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> 
> 
> -- 
> Chuck Lever
> 






[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux