[nfs-utils PATCH] rpc-statd.service: weaken the dependency on rpcbind.socket

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



In 91da135f ("systemd unit files: fix up dependencies on rpcbind"),
Neil laid out the rationale for how the nfs services should define their
dependencies on rpcbind.  In a nutshell:

1. Dependencies should only be defined using rpcbind.socket
2. Ordering for dependencies should only be defined usint "After="
3. nfs-server.service should use "Wants=rpcbind.socket", to allow
   rpcbind.socket to be masked in NFSv4-only setups.
4. rpc-statd.service should use "Requires=rpcbind.socket", as rpc.statd
   is useless if it can't register with rpcbind.

Then in https://bugzilla.redhat.com/show_bug.cgi?id=2100395, Ben noted
that due to the way the dependencies are ordered, when 'systemctl stop
rpcbind.socket' is run, systemd first sends SIGTERM to rpcbind, then
SIGTERM to rpc.statd.  On SIGTERM, rpcbind tears down /var/run/rpcbind.sock.
However, rpc-statd on SIGTERM attempts to unregister from rpcbind.  This
results in a long delay:

[root@rawhide ~]# time systemctl restart rpcbind.socket

real	1m0.147s
user	0m0.004s
sys	0m0.003s

8a835ceb ("rpc-statd.service: Stop rpcbind and rpc.stat in an exit race")
fixed this by changing the dependency in rpc-statd.service to use
"After=rpcbind.service", bending rule #1 from above.

Yongcheng recently noted that when runnnig the following test:

[root@rawhide ~]# for i in `seq 10`; do systemctl reset-failed; \
	systemctl stop rpcbind rpcbind.socket ; systemctl restart nfs-server ; \
	systemctl status rpc-statd; done

rpc-statd.service would often fail to start:

× rpc-statd.service - NFS status monitor for NFSv2/3 locking.
     Loaded: loaded (/usr/lib/systemd/system/rpc-statd.service; enabled-runtime; preset: disabled)
    Drop-In: /usr/lib/systemd/system/service.d
             └─10-timeout-abort.conf
     Active: failed (Result: exit-code) since Fri 2025-09-05 18:01:15 EDT; 229ms ago
   Duration: 228ms
 Invocation: bafb2bb00761439ebc348000704e8fbb
       Docs: man:rpc.statd(8)
    Process: 29937 ExecStart=/usr/sbin/rpc.statd (code=exited, status=1/FAILURE)
   Mem peak: 1.5M
        CPU: 7ms

Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Version 2.8.2 starting
Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Flags: TI-RPC
Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, udp): svc_reg() err: RPC: Remote system error - Connection refused
Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, tcp): svc_reg() err: RPC: Success
Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, udp6): svc_reg() err: RPC: Success
Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: Failed to register (statd, 1, tcp6): svc_reg() err: RPC: Success
Sep 05 18:01:15 rawhide.smayhew.test rpc.statd[29938]: failed to create RPC listeners, exiting
Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: rpc-statd.service: Control process exited, code=exited, status=1/FAILURE
Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: rpc-statd.service: Failed with result 'exit-code'.
Sep 05 18:01:15 rawhide.smayhew.test systemd[1]: Failed to start rpc-statd.service - NFS status monitor for NFSv2/3 locking..

I propose we revert the change from 8a835ceb and instead turn the
dependency into a weak dependency by using "Wants=rpcbind.socket"
instead of "Requires=rpcbind.socket".  This bends rule #4 above and will
make it so that systemd will try to start rpcbind.socket if it isn't
already running when rpc-statd.service starts, but it won't restart
rpc-statd.service whenever rpcbind is restarted.  Frankly, we shouldn't
need to restart services whenever rpcbind is restarted (thats why
rpcbind has the warmstart feature).  The only drawback is that now if an
admin wants to set up an NFSv4-only server by masking rpcbind.socket,
they'll need to mask rpc-statd.service as well.  I don't think that's
too much to ask, so the nfs.systemd man page has been updated
accordingly.

Signed-off-by: Scott Mayhew <smayhew@xxxxxxxxxx>
---
 systemd/nfs.systemd.man   | 10 +++++++---
 systemd/rpc-statd.service |  5 +++--
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/systemd/nfs.systemd.man b/systemd/nfs.systemd.man
index a8476038..93fb87cd 100644
--- a/systemd/nfs.systemd.man
+++ b/systemd/nfs.systemd.man
@@ -137,7 +137,9 @@ NFSv2) and does not want
 .I rpcbind
 to be running, the correct approach is to run
 .RS
-.B systemctl mask rpcbind
+.B systemctl mask rpcbind.socket
+.br
+.B systemctl mask rpc-statd.service
 .RE
 This will disable
 .IR rpcbind ,
@@ -145,9 +147,11 @@ and the various NFS services which depend on it (and are only needed
 for NFSv3) will refuse to start, without interfering with the
 operation of NFSv4 services.  In particular,
 .I rpc.statd
-will not run when
+will fail to start when
 .I rpcbind
-is masked.
+is masked, so
+.I rpc-statd.service
+should be masked as well.
 .PP
 .I idmapd
 is only needed for NFSv4, and even then is not needed when the client
diff --git a/systemd/rpc-statd.service b/systemd/rpc-statd.service
index 660ed861..4e138f69 100644
--- a/systemd/rpc-statd.service
+++ b/systemd/rpc-statd.service
@@ -3,10 +3,11 @@ Description=NFS status monitor for NFSv2/3 locking.
 Documentation=man:rpc.statd(8)
 DefaultDependencies=no
 Conflicts=umount.target
-Requires=nss-lookup.target rpcbind.socket
+Requires=nss-lookup.target
+Wants=rpcbind.socket
 Wants=network-online.target
 Wants=rpc-statd-notify.service
-After=network-online.target nss-lookup.target rpcbind.service
+After=network-online.target nss-lookup.target rpcbind.socket
 
 PartOf=nfs-utils.service
 IgnoreOnIsolate=yes
-- 
2.50.1





[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux