Hi Folks, Do you have any opinion on this one? Would you like me to address it differently? Tigran. ----- Original Message ----- > From: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx> > To: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx> > Cc: "Trond Myklebust" <trondmy@xxxxxxxxxx>, "Anna Schumaker" <anna@xxxxxxxxxx>, "Tigran Mkrtchyan" > <tigran.mkrtchyan@xxxxxxx> > Sent: Monday, 9 June, 2025 23:43:03 > Subject: [PATCH 1/1] pNFS/flexfiles: mark device unavailable on fatal connection error > Fixes: 260f32adb88 ("pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect") > > When an applications get killed (SIGTERM/SIGINT) while pNFS client performs a > connection > to DS, client ends in an infinite loop of connect-disconnect. This > source of the issue, it that flexfilelayoutdev#nfs4_ff_layout_prepare_ds gets an > error > on nfs4_pnfs_ds_connect with status ERESTARTSYS, which is set by > rpc_signal_task, but > the error is treated as transient, thus retried. > > The issue is reproducible with script as (there should be ~1000 files in > a directory, client should must not have any connections to DSes): > > ``` > echo 3 > /proc/sys/vm/drop_caches > > for i in * > do > head -1 $i & > PP=$! > sleep 10e-03 > kill -TERM $PP > done > ``` > > Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@xxxxxxx> > --- > fs/nfs/flexfilelayout/flexfilelayoutdev.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c > b/fs/nfs/flexfilelayout/flexfilelayoutdev.c > index 4a304cf17c4b..0008a8180c9b 100644 > --- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c > +++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c > @@ -410,6 +410,10 @@ nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg, > mirror->mirror_ds->ds_versions[0].wsize = max_payload; > goto out; > } > + /* There is a fatal error to connect to DS. Mark it unavailable to avoid > infinite retry loop. */ > + if (nfs_error_is_fatal(status)) > + nfs4_mark_deviceid_unavailable(&mirror->mirror_ds->id_node); > + > noconnect: > ff_layout_track_ds_error(FF_LAYOUT_FROM_HDR(lseg->pls_layout), > mirror, lseg->pls_range.offset, > -- > 2.49.0
Attachment:
smime.p7s
Description: S/MIME Cryptographic Signature