Fixes: 260f32adb88 ("pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect") When an applications get killed (SIGTERM/SIGINT) while pNFS client performs a connection to DS, client ends in an infinite loop of connect-disconnect. This source of the issue, it that flexfilelayoutdev#nfs4_ff_layout_prepare_ds gets an error on nfs4_pnfs_ds_connect with status ERESTARTSYS, which is set by rpc_signal_task, but the error is treated as transient, thus retried. The issue is reproducible with script as (there should be ~1000 files in a directory, client should must not have any connections to DSes): ``` echo 3 > /proc/sys/vm/drop_caches for i in * do head -1 $i & PP=$! sleep 10e-03 kill -TERM $PP done ``` Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@xxxxxxx> --- fs/nfs/flexfilelayout/flexfilelayoutdev.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c index 4a304cf17c4b..0008a8180c9b 100644 --- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c +++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c @@ -410,6 +410,10 @@ nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg, mirror->mirror_ds->ds_versions[0].wsize = max_payload; goto out; } + /* There is a fatal error to connect to DS. Mark it unavailable to avoid infinite retry loop. */ + if (nfs_error_is_fatal(status)) + nfs4_mark_deviceid_unavailable(&mirror->mirror_ds->id_node); + noconnect: ff_layout_track_ds_error(FF_LAYOUT_FROM_HDR(lseg->pls_layout), mirror, lseg->pls_range.offset, -- 2.49.0