On 25 Mar 2025, at 20:37, Trond Myklebust wrote: > On Tue, 2025-03-25 at 20:15 -0400, Jeff Layton wrote: >> On Tue, 2025-03-25 at 18:35 -0400, trondmy@xxxxxxxxxx wrote: >>> From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> >>> >>> The nfs_client manages state for all the superblocks in the >>> "cl_superblocks" list, so it must not be shut down until all of >>> them are >>> gone. >>> >>> Fixes: 7d3e26a054c8 ("NFS: Cancel all existing RPC tasks when >>> shutdown") >>> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> >>> --- >>> fs/nfs/sysfs.c | 22 +++++++++++++++++++++- >>> 1 file changed, 21 insertions(+), 1 deletion(-) >>> >>> diff --git a/fs/nfs/sysfs.c b/fs/nfs/sysfs.c >>> index b30401b2c939..37cb2b776435 100644 >>> --- a/fs/nfs/sysfs.c >>> +++ b/fs/nfs/sysfs.c >>> @@ -14,6 +14,7 @@ >>> #include <linux/rcupdate.h> >>> #include <linux/lockd/lockd.h> >>> >>> +#include "internal.h" >>> #include "nfs4_fs.h" >>> #include "netns.h" >>> #include "sysfs.h" >>> @@ -228,6 +229,25 @@ static void shutdown_client(struct rpc_clnt >>> *clnt) >>> rpc_cancel_tasks(clnt, -EIO, shutdown_match_client, NULL); >>> } >>> >>> +/* >>> + * Shut down the nfs_client only once all the superblocks >>> + * have been shut down. >>> + */ >>> +static void shutdown_nfs_client(struct nfs_client *clp) >>> +{ >>> + struct nfs_server *server; >>> + rcu_read_lock(); >>> + list_for_each_entry_rcu(server, &clp->cl_superblocks, >>> client_link) { >>> + if (!(server->flags & NFS_MOUNT_SHUTDOWN)) { >>> + rcu_read_unlock(); >>> + return; >>> + } >>> + } >>> + rcu_read_unlock(); >>> + nfs_mark_client_ready(clp, -EIO); >>> + shutdown_client(clp->cl_rpcclient); >>> +} >>> + >> >> Isn't the upshot of this that a mount won't actually get shutdown >> until >> you shutdown all of the mounts to the same server? Personally, I find >> that acceptable, but we should note that it is a change in behavior. > > It means that the other mounts to the same server won't inevitably and > irrevocably lock up. I'm unhappy that I missed this bug when I applied > the patch, but that's not a reason to not fix it. > > As I said in the above changelog, the nfs_client's RPC client is there > to manage state for all mounts to the same server in the same network > namespace. If you shut down that client while there are still mounts > that depend on it, then not only have you shot yourself in the foot, > but you're going to spend a lot of electrons to just loop madly when > those other mounts need to recover state but can't because you've > permanently shut down the only way for recovery threads to communicate > with the server. The only use for shutdown so far had been when the server is permanently unavailable, so breaking mounts to the same server didn't matter. I agree this fix is the right thing to do now. Reviewed-by: Benjamin Coddington <bcodding@xxxxxxxxxx> Ben