On Tue, 2025-03-25 at 14:04 -0400, Jeff Layton wrote: > On Tue, 2025-03-25 at 12:17 -0400, trondmy@xxxxxxxxxx wrote: > > From: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > > > If a containerised process is killed and causes an ENETUNREACH or > > ENETDOWN error to be propagated to the state manager, then mark the > > nfs_client as being dead so that we don't loop in functions that > > are > > expecting recovery to succeed. > > > > Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx> > > --- > > fs/nfs/nfs4state.c | 10 +++++++++- > > 1 file changed, 9 insertions(+), 1 deletion(-) > > > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c > > index 738eb2789266..14ba3f96e6fc 100644 > > --- a/fs/nfs/nfs4state.c > > +++ b/fs/nfs/nfs4state.c > > @@ -2739,7 +2739,15 @@ static void nfs4_state_manager(struct > > nfs_client *clp) > > pr_warn_ratelimited("NFS: state manager%s%s failed on > > NFSv4 server %s" > > " with error %d\n", section_sep, section, > > clp->cl_hostname, -status); > > - ssleep(1); > > + switch (status) { > > + case -ENETDOWN: > > + case -ENETUNREACH: > > + nfs_mark_client_ready(clp, -EIO); > > + break; > > Should this be conditional on clnt->cl_netunreach_fatal being true? I should hope not. We shouldn't ever be seeing these errors here if it is false. > > > + default: > > + ssleep(1); > > + break; > > + } > > out_drain: > > memalloc_nofs_restore(memflags); > > nfs4_end_drain_session(clp); > -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx