On Tue, 09 Sep 2025, Scott Mayhew wrote: > On Sat, 06 Sep 2025, NeilBrown wrote: > > > On Sat, 06 Sep 2025, Scott Mayhew wrote: > > > In 91da135f ("systemd unit files: fix up dependencies on rpcbind"), > > > Neil laid out the rationale for how the nfs services should define their > > > dependencies on rpcbind. In a nutshell: > > > > > > 1. Dependencies should only be defined using rpcbind.socket > > > 2. Ordering for dependencies should only be defined usint "After=" > > > 3. nfs-server.service should use "Wants=rpcbind.socket", to allow > > > rpcbind.socket to be masked in NFSv4-only setups. > > > 4. rpc-statd.service should use "Requires=rpcbind.socket", as rpc.statd > > > is useless if it can't register with rpcbind. > > > > > > Then in https://bugzilla.redhat.com/show_bug.cgi?id=2100395, Ben noted > > > that due to the way the dependencies are ordered, when 'systemctl stop > > > rpcbind.socket' is run, systemd first sends SIGTERM to rpcbind, then > > > SIGTERM to rpc.statd. On SIGTERM, rpcbind tears down /var/run/rpcbind.sock. > > > However, rpc-statd on SIGTERM attempts to unregister from rpcbind. This > > > results in a long delay: > > > > > > [root@rawhide ~]# time systemctl restart rpcbind.socket > > > > > > real 1m0.147s > > > user 0m0.004s > > > sys 0m0.003s > > > > > > 8a835ceb ("rpc-statd.service: Stop rpcbind and rpc.stat in an exit race") > > > fixed this by changing the dependency in rpc-statd.service to use > > > "After=rpcbind.service", bending rule #1 from above. > > > > Thanks for the thorough and detailed explanation. > > > > I'd like to suggest a different fix. Change rpc-statd.service to > > declare: > > > > After=network-online.target nss-lookup.target rpcbind.socket rpcbind.service > > > > i.e. it is declared to be After both the socket and the service. > > > > "After" declarations only have effect if the units are in the same > > transaction. If the Unit is not being started or stopped, the After > > declaration has no effect. > > > > So on startup, this will ensure rpcbind.socket is started before > > rpc-statd.service. > > On shutdown in a transaction that stops both rpc-statd.service and > > rpcbind.service, rpcbind.service won't be stopped until after > > rpc-statd.service is stopped. > > That works too. > > > > > I agree that it isn't necessary to restart rpc-statd when rpcbind is > > restarted. > > Maybe that is a justification to use Wants instead of Requires. > > Or maybe Upholds would be even better. > > I think Upholds is confusing.... especially since there aren't any > existing unit files using it, at least on a stock Fedora Rawhide > system. I don't see it being used on OpenSUSE Tumbleweed or Debian > Trixie either. I think it's going to confuse users if they try to stop > rpcbind.socket and then find that it's still running. Finally, when I tested > it, it prevented me from stopping rpc-statd. Eventually the shutdown > timer hit and systemd sent rpc-statd a SIGABRT, which in turned > triggered the systemd-coredump handler. That's a whole mess of syslog > entries that's going to more bug reports. I'd rather stick with Wants. Thanks for testing. I've never used Upholds myself but I was reading the man page any wondered if it might be a good fit. Apparently it isn't. > > > > > I wonder if putting > > > > ConditionPathIsSymbolisLink !/etc/systemd/system/rpcbind.socket > > I'm lost. What what cause the rpcbind.socket symlink to be created > directly in /etc/systemd/system? I've seen it get created in > /etc/systemd/system/sockets.target.wants or > /etc/systemd/system/multi-user.target.wants, but never directly in > /etc/systemd/system. $ sudo systemctl mask rpcbind.socket Created symlink '/etc/systemd/system/rpcbind.socket' → '/dev/null'. I'm not sure it's a good idea. I was mostly thinking aloud. I think the After line should be changed to include both .service and .socket and that cleanly fixed just the observed problem about shutdown. And I don't think any other change should be made without a clear demonstrated need. Thanks, NeilBrown > > -Scott > > > > in rpc-statd.service would be a suitable way to stop rpc-statd from > > starting if rpcbind.socket is masked. > > > > In any case I think there are two separate issues here which deserve two > > separate patch. > > 1/ shutdown ordering isn't handled correctly. Adding the extra After > > directive should fix that > > 2/ rpc.statd is restarted unnecessarily. Wants or Upholds might be part > > of the solution. > > > > Thanks, > > NeilBrown > > > > > > > > > > > > Yongcheng recently noted that when runnnig the following test: > > > > > > [root@rawhide ~]# for i in `seq 10`; do systemctl reset-failed; \ > > > systemctl stop rpcbind rpcbind.socket ; systemctl restart nfs-server ; \ > > > systemctl status rpc-statd; done > > > > > > rpc-statd.service would often fail to start: > > > > > > × rpc-statd.service - NFS status monitor for NFSv2/3 locking. > > > Loaded: loaded (/usr/lib/systemd/system/rpc-statd.service; enabled-runtime; preset: disabled) > > > Drop-In: /usr/lib/systemd/system/service.d > > > └─10-timeout-abort.conf > > > Active: failed (Result: exit-code) since Fri 2025-09-05 18:01:15 EDT; 229ms ago > > > Duration: 228ms > > > Invocation: bafb2bb00761439ebc348000704e8fbb > > > Docs: man:rpc.statd(8) > > > Process: 29937 ExecStart=/usr/sbin/rpc.statd (code=exited, status=1/FAILURE) > > > Mem peak: 1.5M > > > CPU: 7ms > > > > > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Version 2.8.2 starting > > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Flags: TI-RPC > > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, udp): svc_reg() err: RPC: Remote system error - Connection refused > > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, tcp): svc_reg() err: RPC: Success > > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, udp6): svc_reg() err: RPC: Success > > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, tcp6): svc_reg() err: RPC: Success > > > Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: failed to create RPC listeners, exiting > > > Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: rpc-statd.service: Control process exited, code=exited, status=1/FAILURE > > > Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: rpc-statd.service: Failed with result 'exit-code'. > > > Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: Failed to start rpc-statd.service - NFS status monitor for NFSv2/3 locking.. > > > > > > I propose we revert the change from 8a835ceb and instead turn the > > > dependency into a weak dependency by using "Wants=rpcbind.socket" > > > instead of "Requires=rpcbind.socket". This bends rule #4 above and will > > > make it so that systemd will try to start rpcbind.socket if it isn't > > > already running when rpc-statd.service starts, but it won't restart > > > rpc-statd.service whenever rpcbind is restarted. Frankly, we shouldn't > > > need to restart services whenever rpcbind is restarted (thats why > > > rpcbind has the warmstart feature). The only drawback is that now if an > > > admin wants to set up an NFSv4-only server by masking rpcbind.socket, > > > they'll need to mask rpc-statd.service as well. I don't think that's > > > too much to ask, so the nfs.systemd man page has been updated > > > accordingly. > > > > > > Signed-off-by: Scott Mayhew <smayhew@xxxxxxxxxx> > > > --- > > > systemd/nfs.systemd.man | 10 +++++++--- > > > systemd/rpc-statd.service | 5 +++-- > > > 2 files changed, 10 insertions(+), 5 deletions(-) > > > > > > diff --git a/systemd/nfs.systemd.man b/systemd/nfs.systemd.man > > > index a8476038..93fb87cd 100644 > > > --- a/systemd/nfs.systemd.man > > > +++ b/systemd/nfs.systemd.man > > > @@ -137,7 +137,9 @@ NFSv2) and does not want > > > .I rpcbind > > > to be running, the correct approach is to run > > > .RS > > > -.B systemctl mask rpcbind > > > +.B systemctl mask rpcbind.socket > > > +.br > > > +.B systemctl mask rpc-statd.service > > > .RE > > > This will disable > > > .IR rpcbind , > > > @@ -145,9 +147,11 @@ and the various NFS services which depend on it (and are only needed > > > for NFSv3) will refuse to start, without interfering with the > > > operation of NFSv4 services. In particular, > > > .I rpc.statd > > > -will not run when > > > +will fail to start when > > > .I rpcbind > > > -is masked. > > > +is masked, so > > > +.I rpc-statd.service > > > +should be masked as well. > > > .PP > > > .I idmapd > > > is only needed for NFSv4, and even then is not needed when the client > > > diff --git a/systemd/rpc-statd.service b/systemd/rpc-statd.service > > > index 660ed861..4e138f69 100644 > > > --- a/systemd/rpc-statd.service > > > +++ b/systemd/rpc-statd.service > > > @@ -3,10 +3,11 @@ Description=NFS status monitor for NFSv2/3 locking. > > > Documentation=man:rpc.statd(8) > > > DefaultDependencies=no > > > Conflicts=umount.target > > > -Requires=nss-lookup.target rpcbind.socket > > > +Requires=nss-lookup.target > > > +Wants=rpcbind.socket > > > Wants=network-online.target > > > Wants=rpc-statd-notify.service > > > -After=network-online.target nss-lookup.target rpcbind.service > > > +After=network-online.target nss-lookup.target rpcbind.socket > > > > > > PartOf=nfs-utils.service > > > IgnoreOnIsolate=yes > > > -- > > > 2.50.1 > > > > > > > > > > > >