Re: [RFC PATCH 1/1] nfsd: delay re-registering of listeners after listener removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2025-08-21 at 17:01 -0400, Chuck Lever wrote:
> On 8/21/25 4:43 PM, Olga Kornievskaia wrote:
> > This patch tries to address the following failure:
> > nfsdctl threads 0
> > nfsdctl listener +rdma::20049
> > nfsdctl listener +tcp::2049
> > nfsdctl listener -tcp::2049
> > nfsdctl: Error: Cannot assign requested address
> > 
> > The reason for the failure is due to the fact that socket cleanup only
> > happens in __svc_rdma_free() which is a deferred work triggers when an
> > rdma transport is destroyed. To remove a listener nfsdctl is forced to
> > first remove all transports via svc_xprt_destroy_all() and then re-add
> > the ones that are left. Due to the fact that there isn't a way to
> > delete a particular entry from a list where permanent sockets are
> > stored.
> 
> The issue is specifically with llist, which does not permit the
> deletion of any entry other than the first on the list.
> 

Yep. I had a patch that allowed arbitrary removals, but it was pretty
messy.

> 
> > Going back to the deferred work done in __svc_rdma_free(), the
> > work might not get to run before nfsd_nl_listener_set_doit() creates
> > the new transports. As a result, it finds that something is still
> > listening of the rdma port and rdma_bind_addr() fails.
> > 
> > Proposed solution is to add a delay after svc_xprt_destroy_all() to
> > allow for the deferred work to run.
> > 
> > --- Is the chosen value of 1s enough to ensure socket goes away?
> > I can't guarantee that.
> 
> Adding a sleep and hoping it works is ... not a proper fix. The
> msleep() in svc_xprt_destroy_all() is part of a polling loop,
> and it is only waiting for the xprt lists to become empty. You're
> not polling here (ie, checking for completion before sleeping).
> 

Agreed. I think that we need to fix this properly one way or another.

> 
> > --- Alternatives that i can think of:
> > (1) to go back to listener removal approach that added removal of
> > entry to the llist api. That would not require a removal of all
> > transport causing this problem to occur. Earlier it was preferred
> > not to change llist api.

That patch was also just icky. llists are singly-linked so you have to
scan the list to find the previous entry.

> > (2) some method of checking that all deferred work occuring in
> > svc_xprt_destroy_all() completed.
> 
> Jeff (and perhaps Lorenzo) need to go back to the original reasons why
> this was done and rework it. I think we were avoiding holding the
> nfsd mutex in here?
> 

No, the nfsd_mutex is definitely held when svc_xprt_destroy_all() is
called.

> Complete shutdown of a transport always involve some deferred
> activity, and there's a reference count involved as well. I can't
> see how the current destroy/re-insert mechanism can be made reliable.
> 

This is a place where blocking is OK. svc_xprt_destroy_all() already
waits. Why can't it also wait for the RDMA listeners to be completely
torn down?

If that isn't feasible then the only other thing I can think of would
be to change the 'nfsdctl listener' command to not allow the removal of
individual listeners.  With the legacy nfsdfs interfaces, you could
only remove them by writing '0' to the "threads" file. Maybe we could
make nfsdctl work similarly (even if it's extremely unintuitive).

> 
> > Fixes: d093c90892607 ("nfsd: fix management of listener transports")
> > Signed-off-by: Olga Kornievskaia <okorniev@xxxxxxxxxx>
> > ---
> >  fs/nfsd/nfsctl.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index dd3267b4c203..f9f5670abcc3 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -1998,8 +1998,10 @@ int nfsd_nl_listener_set_doit(struct sk_buff *skb, struct genl_info *info)
> >  	 * Since we can't delete an arbitrary llist entry, destroy the
> >  	 * remaining listeners and recreate the list.
> >  	 */
> > -	if (delete)
> > +	if (delete) {
> >  		svc_xprt_destroy_all(serv, net, false);
> > +		ssleep(1);
> > +	}
> >  
> >  	/* walk list of addrs again, open any that still don't exist */
> >  	nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> 

-- 
Jeff Layton <jlayton@xxxxxxxxxx>





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux