Re: [PATCH 1/1] pNFS/flexfiles: mark device unavailable on fatal connection error

Hi Folks,

Do you have any opinion on this one? Would you like me to address it differently?

Tigran. 

----- Original Message -----
> From: "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
> To: "linux-nfs" <linux-nfs@xxxxxxxxxxxxxxx>
> Cc: "Trond Myklebust" <trondmy@xxxxxxxxxx>, "Anna Schumaker" <anna@xxxxxxxxxx>, "Tigran Mkrtchyan" <tigran.mkrtchyan@xxxxxxx>
> Sent: Monday, 9 June, 2025 23:43:03
> Subject: [PATCH 1/1] pNFS/flexfiles: mark device unavailable on fatal connection error

> Fixes: 260f32adb88 ("pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect")
> 
> When an application gets killed (SIGTERM/SIGINT) while the pNFS client is
> connecting to a DS, the client ends up in an infinite connect-disconnect
> loop. The source of the issue is that
> flexfilelayoutdev#nfs4_ff_layout_prepare_ds gets an error from
> nfs4_pnfs_ds_connect with status ERESTARTSYS, which is set by
> rpc_signal_task, but the error is treated as transient and is thus retried.
> 
> The issue is reproducible with the following script (there should be ~1000
> files in the directory, and the client must not have any connections to DSes):
> 
> ```
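> # Drop caches so that each read has to go back to the servers
> echo 3 > /proc/sys/vm/drop_caches
> 
> for i in *
> do
>        # Start a read in the background, then kill it ~10 ms later
>        head -1 "$i" &
>        PP=$!
>        sleep 0.01
>        kill -TERM "$PP"
> done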
> ```
> 
> Signed-off-by: Tigran Mkrtchyan <tigran.mkrtchyan@xxxxxxx>
> ---
> fs/nfs/flexfilelayout/flexfilelayoutdev.c | 4 ++++
> 1 file changed, 4 insertions(+)
> 
> diff --git a/fs/nfs/flexfilelayout/flexfilelayoutdev.c b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
> index 4a304cf17c4b..0008a8180c9b 100644
> --- a/fs/nfs/flexfilelayout/flexfilelayoutdev.c
> +++ b/fs/nfs/flexfilelayout/flexfilelayoutdev.c
> @@ -410,6 +410,10 @@ nfs4_ff_layout_prepare_ds(struct pnfs_layout_segment *lseg,
> 			mirror->mirror_ds->ds_versions[0].wsize = max_payload;
> 		goto out;
> 	}
> +	/* There is a fatal error to connect to DS. Mark it unavailable to avoid infinite retry loop. */
> +	if (nfs_error_is_fatal(status))
> +		nfs4_mark_deviceid_unavailable(&mirror->mirror_ds->id_node);
> +
> noconnect:
> 	ff_layout_track_ds_error(FF_LAYOUT_FROM_HDR(lseg->pls_layout),
> 				 mirror, lseg->pls_range.offset,
> --
> 2.49.0
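
To make the intended effect easier to discuss, here is a minimal user-space sketch of the retry behaviour with and without the change. It is only a model, not kernel code: `struct model_ds`, `model_error_is_fatal()`, `model_ds_connect()`, `model_prepare_ds()` and `model_retry_loop()` are hypothetical names invented for illustration, the list of fatal errors is deliberately shortened, and the stub connect always fails with ERESTARTSYS as if the task had been signalled. The use of -ENODEV for an unavailable device is also an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* ERESTARTSYS is kernel-internal and is not exported by userspace
 * <errno.h>; 512 is the value the kernel uses. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical stand-in for the per-DS device state. */
struct model_ds {
	bool unavailable;      /* models a device marked unavailable */
	int connect_attempts;  /* how many times connect was tried */
};

/* Simplified stand-in for nfs_error_is_fatal(); only a few errors
 * are modeled here (assumption, not the kernel's full list). */
bool model_error_is_fatal(int status)
{
	switch (status) {
	case -ERESTARTSYS:
	case -EINTR:
	case -EIO:
		return true;
	default:
		return false;
	}
}

/* Stub connect: always fails as if the task had been signalled. */
int model_ds_connect(struct model_ds *ds)
{
	ds->connect_attempts++;
	return -ERESTARTSYS;
}

/* One "prepare DS" attempt. With mark_on_fatal set, a fatal connect
 * error marks the device unavailable, as the patch proposes. */
int model_prepare_ds(struct model_ds *ds, bool mark_on_fatal)
{
	int status;

	if (ds->unavailable)
		return -ENODEV;

	status = model_ds_connect(ds);
	if (status == 0)
		return 0;

	if (mark_on_fatal && model_error_is_fatal(status))
		ds->unavailable = true;

	return status;
}

/* Caller that retries until the device is reported unavailable or the
 * attempt budget runs out. Returns the number of connect attempts. */
int model_retry_loop(bool mark_on_fatal, int max_attempts)
{
	struct model_ds ds = { .unavailable = false, .connect_attempts = 0 };

	assert(max_attempts > 0);
	for (int i = 0; i < max_attempts; i++) {
		if (model_prepare_ds(&ds, mark_on_fatal) == -ENODEV)
			break;
	}
	return ds.connect_attempts;
}
```

In this model, `model_retry_loop(false, 100)` uses the whole attempt budget, while `model_retry_loop(true, 100)` stops after the first failed connect because the device is then marked unavailable.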
