On Thu, Sep 4, 2025 at 11:48 AM Chuck Lever <cel@xxxxxxxxxx> wrote:
>
> On 9/4/25 11:43 AM, Olga Kornievskaia wrote:
> > On Wed, Sep 3, 2025 at 11:53 AM Chuck Lever <cel@xxxxxxxxxx> wrote:
> >>
> >> From: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >>
> >> NFSD has always supported adding network listeners. The new netlink
> >> protocol now enables the removal of listeners.
> >>
> >> Olga noticed that if an RDMA listener is removed and immediately
> >> re-added, the deferred __svc_rdma_free() function might not have
> >> run yet, so some or all of the old listener's RDMA resources
> >> linger, which prevents a new listener on the same address from
> >> being created.
> >
> > Does this mean that you prefer to go the way of rdma synchronous
> > release vs the patch I posted?
>
> We could do both. IMO we need to make "remove listener" work while
> there are still nfsd threads running, and this RFC patch does
> nothing about that.
>
> But as noted below, it looks like the svc_xprt_free() code path assumes
> that ->xpo_free releases all transport resources synchronously, and
> there can be consequences if that does not happen. That needs to be
> addressed somehow.
>
> > I'm not against the approach; as I have previously noted, it is an
> > alternative which I tested, and it also solves the problem. But I still
> > don't grasp the consequences of making svc_rdma_free() synchronous,
> > especially for active transports (not listening sockets).
>
> I've tested the synchronous approach a little, and there didn't seem to
> be a problem with it. But I agree, the certainty level is not as
> high as it ought to be.

So what do you think about including this patch? I don't see it in
your nfsd-testing branch. Either this patch or my patch fixes an
existing problem, and I believe it would be beneficial to include one
of them now (and backport it). A solution for removal of listeners on
an active server can be worked on top of that.

> >> Also, svc_xprt_free() does a module_put() just after calling
> >> ->xpo_free(). That means that if there is deferred work going on,
> >> the module could be unloaded before that work is even started,
> >> resulting in a UAF.
> >>
> >> Neil asks:
> >>> What particular part of __svc_rdma_free() needs to run in order for a
> >>> subsequent registration to succeed?
> >>> Can that bit be run directly from svc_rdma_free() rather than be
> >>> delayed?
> >>> (I know almost nothing about rdma so forgive me if the answers to these
> >>> questions seem obvious)
> >>
> >> The reasons I can recall are:
> >>
> >> - Some of the transport tear-down work can sleep
> >> - Releasing a cm_id is tricky and can deadlock
> >>
> >> We might be able to mitigate the second issue with judicious
> >> application of transport reference counting.
> >>
> >> Reported-by: Olga Kornievskaia <okorniev@xxxxxxxxxx>
> >> Suggested-by: NeilBrown <neil@xxxxxxxxxx>
> >> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
> >> ---
> >>  net/sunrpc/svc_xprt.c                    |  1 +
> >>  net/sunrpc/xprtrdma/svc_rdma_transport.c | 19 ++++++++-----------
> >>  2 files changed, 9 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> >> index 8b1837228799..8526bfc3ab20 100644
> >> --- a/net/sunrpc/svc_xprt.c
> >> +++ b/net/sunrpc/svc_xprt.c
> >> @@ -168,6 +168,7 @@ static void svc_xprt_free(struct kref *kref)
> >>  	struct svc_xprt *xprt =
> >>  		container_of(kref, struct svc_xprt, xpt_ref);
> >>  	struct module *owner = xprt->xpt_class->xcl_owner;
> >> +
> >>  	if (test_bit(XPT_CACHE_AUTH, &xprt->xpt_flags))
> >>  		svcauth_unix_info_release(xprt);
> >>  	put_cred(xprt->xpt_cred);
> >> diff --git a/net/sunrpc/xprtrdma/svc_rdma_transport.c b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> >> index 3d7f1413df02..b7b318ad25c4 100644
> >> --- a/net/sunrpc/xprtrdma/svc_rdma_transport.c
> >> +++ b/net/sunrpc/xprtrdma/svc_rdma_transport.c
> >> @@ -591,12 +591,18 @@ static void svc_rdma_detach(struct svc_xprt *xprt)
> >>  		rdma_disconnect(rdma->sc_cm_id);
> >>  }
> >>
> >> -static void __svc_rdma_free(struct work_struct *work)
> >> +/**
> >> + * svc_rdma_free - Release class-specific transport resources
> >> + * @xprt: Generic svc transport object
> >> + */
> >> +static void svc_rdma_free(struct svc_xprt *xprt)
> >>  {
> >>  	struct svcxprt_rdma *rdma =
> >> -		container_of(work, struct svcxprt_rdma, sc_work);
> >> +		container_of(xprt, struct svcxprt_rdma, sc_xprt);
> >>  	struct ib_device *device = rdma->sc_cm_id->device;
> >>
> >> +	might_sleep();
> >> +
> >>  	/* This blocks until the Completion Queues are empty */
> >>  	if (rdma->sc_qp && !IS_ERR(rdma->sc_qp))
> >>  		ib_drain_qp(rdma->sc_qp);
> >> @@ -629,15 +635,6 @@ static void __svc_rdma_free(struct work_struct *work)
> >>  	kfree(rdma);
> >>  }
> >>
> >> -static void svc_rdma_free(struct svc_xprt *xprt)
> >> -{
> >> -	struct svcxprt_rdma *rdma =
> >> -		container_of(xprt, struct svcxprt_rdma, sc_xprt);
> >> -
> >> -	INIT_WORK(&rdma->sc_work, __svc_rdma_free);
> >> -	schedule_work(&rdma->sc_work);
> >> -}
> >> -
> >>  static int svc_rdma_has_wspace(struct svc_xprt *xprt)
> >>  {
> >>  	struct svcxprt_rdma *rdma =
> >> --
> >> 2.50.0
> >>
> >
>
>
> --
> Chuck Lever
>