On Sat, 06 Sep 2025, NeilBrown wrote: > On Sat, 06 Sep 2025, Scott Mayhew wrote: > > In 91da135f ("systemd unit files: fix up dependencies on rpcbind"), > > Neil laid out the rationale for how the nfs services should define their > > dependencies on rpcbind. In a nutshell: > > > > 1. Dependencies should only be defined using rpcbind.socket > > 2. Ordering for dependencies should only be defined usint "After=" > > 3. nfs-server.service should use "Wants=rpcbind.socket", to allow > > rpcbind.socket to be masked in NFSv4-only setups. > > 4. rpc-statd.service should use "Requires=rpcbind.socket", as rpc.statd > > is useless if it can't register with rpcbind. > > > > Then in https://bugzilla.redhat.com/show_bug.cgi?id=2100395, Ben noted > > that due to the way the dependencies are ordered, when 'systemctl stop > > rpcbind.socket' is run, systemd first sends SIGTERM to rpcbind, then > > SIGTERM to rpc.statd. On SIGTERM, rpcbind tears down /var/run/rpcbind.sock. > > However, rpc-statd on SIGTERM attempts to unregister from rpcbind. This > > results in a long delay: > > > > [root@rawhide ~]# time systemctl restart rpcbind.socket > > > > real 1m0.147s > > user 0m0.004s > > sys 0m0.003s > > > > 8a835ceb ("rpc-statd.service: Stop rpcbind and rpc.stat in an exit race") > > fixed this by changing the dependency in rpc-statd.service to use > > "After=rpcbind.service", bending rule #1 from above. > > Thanks for the thorough and detailed explanation. > > I'd like to suggest a different fix. Change rpc-statd.service to > declare: > > After=network-online.target nss-lookup.target rpcbind.socket rpcbind.service > > i.e. it is declared to be After both the socket and the service. > > "After" declarations only have effect if the units are in the same > transaction. If the Unit is not being started or stopped, the After > declaration has no effect. > > So on startup, this will ensure rpcbind.socket is started before > rpc-statd.service. > On shutdown in a transaction that stops both rpc-statd.service and > rpcbind.service, rpcbind.service won't be stopped until after > rpc-statd.service is stopped. That works too. > > I agree that it isn't necessary to restart rpc-statd when rpcbind is > restarted. > Maybe that is a justification to use Wants instead of Requires. > Or maybe Upholds would be even better. I think Upholds is confusing.... especially since there aren't any existing unit files using it, at least on a stock Fedora Rawhide system. I don't see it being used on OpenSUSE Tumbleweed or Debian Trixie either. I think it's going to confuse users if they try to stop rpcbind.socket and then find that it's still running. Finally, when I tested it, it prevented me from stopping rpc-statd. Eventually the shutdown timer hit and systemd sent rpc-statd a SIGABRT, which in turned triggered the systemd-coredump handler. That's a whole mess of syslog entries that's going to more bug reports. I'd rather stick with Wants. > > I wonder if putting > > ConditionPathIsSymbolisLink !/etc/systemd/system/rpcbind.socket I'm lost. What what cause the rpcbind.socket symlink to be created directly in /etc/systemd/system? I've seen it get created in /etc/systemd/system/sockets.target.wants or /etc/systemd/system/multi-user.target.wants, but never directly in /etc/systemd/system. -Scott > > in rpc-statd.service would be a suitable way to stop rpc-statd from > starting if rpcbind.socket is masked. > > In any case I think there are two separate issues here which deserve two > separate patch. > 1/ shutdown ordering isn't handled correctly. Adding the extra After > directive should fix that > 2/ rpc.statd is restarted unnecessarily. Wants or Upholds might be part > of the solution. > > Thanks, > NeilBrown > > > > > > > Yongcheng recently noted that when runnnig the following test: > > > > [root@rawhide ~]# for i in `seq 10`; do systemctl reset-failed; \ > > systemctl stop rpcbind rpcbind.socket ; systemctl restart nfs-server ; \ > > systemctl status rpc-statd; done > > > > rpc-statd.service would often fail to start: > > > > × rpc-statd.service - NFS status monitor for NFSv2/3 locking. > > Loaded: loaded (/usr/lib/systemd/system/rpc-statd.service; enabled-runtime; preset: disabled) > > Drop-In: /usr/lib/systemd/system/service.d > > └─10-timeout-abort.conf > > Active: failed (Result: exit-code) since Fri 2025-09-05 18:01:15 EDT; 229ms ago > > Duration: 228ms > > Invocation: bafb2bb00761439ebc348000704e8fbb > > Docs: man:rpc.statd(8) > > Process: 29937 ExecStart=/usr/sbin/rpc.statd (code=exited, status=1/FAILURE) > > Mem peak: 1.5M > > CPU: 7ms > > > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Version 2.8.2 starting > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Flags: TI-RPC > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, udp): svc_reg() err: RPC: Remote system error - Connection refused > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, tcp): svc_reg() err: RPC: Success > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, udp6): svc_reg() err: RPC: Success > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, tcp6): svc_reg() err: RPC: Success > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: failed to create RPC listeners, exiting > > Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: rpc-statd.service: Control process exited, code=exited, status=1/FAILURE > > Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: rpc-statd.service: Failed with result 'exit-code'. > > Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: Failed to start rpc-statd.service - NFS status monitor for NFSv2/3 locking.. > > > > I propose we revert the change from 8a835ceb and instead turn the > > dependency into a weak dependency by using "Wants=rpcbind.socket" > > instead of "Requires=rpcbind.socket". This bends rule #4 above and will > > make it so that systemd will try to start rpcbind.socket if it isn't > > already running when rpc-statd.service starts, but it won't restart > > rpc-statd.service whenever rpcbind is restarted. Frankly, we shouldn't > > need to restart services whenever rpcbind is restarted (thats why > > rpcbind has the warmstart feature). The only drawback is that now if an > > admin wants to set up an NFSv4-only server by masking rpcbind.socket, > > they'll need to mask rpc-statd.service as well. I don't think that's > > too much to ask, so the nfs.systemd man page has been updated > > accordingly. > > > > Signed-off-by: Scott Mayhew <smayhew@xxxxxxxxxx> > > --- > > systemd/nfs.systemd.man | 10 +++++++--- > > systemd/rpc-statd.service | 5 +++-- > > 2 files changed, 10 insertions(+), 5 deletions(-) > > > > diff --git a/systemd/nfs.systemd.man b/systemd/nfs.systemd.man > > index a8476038..93fb87cd 100644 > > --- a/systemd/nfs.systemd.man > > +++ b/systemd/nfs.systemd.man > > @@ -137,7 +137,9 @@ NFSv2) and does not want > > .I rpcbind > > to be running, the correct approach is to run > > .RS > > -.B systemctl mask rpcbind > > +.B systemctl mask rpcbind.socket > > +.br > > +.B systemctl mask rpc-statd.service > > .RE > > This will disable > > .IR rpcbind , > > @@ -145,9 +147,11 @@ and the various NFS services which depend on it (and are only needed > > for NFSv3) will refuse to start, without interfering with the > > operation of NFSv4 services. In particular, > > .I rpc.statd > > -will not run when > > +will fail to start when > > .I rpcbind > > -is masked. > > +is masked, so > > +.I rpc-statd.service > > +should be masked as well. > > .PP > > .I idmapd > > is only needed for NFSv4, and even then is not needed when the client > > diff --git a/systemd/rpc-statd.service b/systemd/rpc-statd.service > > index 660ed861..4e138f69 100644 > > --- a/systemd/rpc-statd.service > > +++ b/systemd/rpc-statd.service > > @@ -3,10 +3,11 @@ Description=NFS status monitor for NFSv2/3 locking. > > Documentation=man:rpc.statd(8) > > DefaultDependencies=no > > Conflicts=umount.target > > -Requires=nss-lookup.target rpcbind.socket > > +Requires=nss-lookup.target > > +Wants=rpcbind.socket > > Wants=network-online.target > > Wants=rpc-statd-notify.service > > -After=network-online.target nss-lookup.target rpcbind.service > > +After=network-online.target nss-lookup.target rpcbind.socket > > > > PartOf=nfs-utils.service > > IgnoreOnIsolate=yes > > -- > > 2.50.1 > > > > > >