Re: [PATCH] sunrpc: don't fail immediately in rpc_wait_bit_killable()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi there,

On 20/08/25 3:08 AM, NeilBrown wrote:
> rpc_wait_bit_killable() is called when it is appropriate for a fatal
> signal to abort the wait.
>
> If it is called late during process exit after exit_signals() is called
> (and when PF_EXITING is set), it cannot receive a fatal signal so
> waiting indefinitely is not safe.
>
> However aborting immediately, as it currently does, is not ideal as it
> mean that the related NFS request cannot succeed, even if the network
> and server are working properly.
>
> One of the causes of filesystem IO when PF_EXITING is set is
> acct_process() which may access the process accounting file.  For a
> NFS-root configuration, this can be accessed over NFS.
>
> In this configuration LTP test "acct02" fails.
>
> Though waiting indefinitely is not appropriate, aborting immediately is
> also not desirable.  This patch aims for a middle ground of waiting at
> most 5 seconds.  This should be enough when NFS service is working, but
> not so much as to delay process exit excessively when NFS service is not
> functioning.
>
> Reported-by: Mark Brown <broonie@xxxxxxxxxx>
> Reported-and-tested-by: Harshvardhan Jha <harshvardhan.j.jha@xxxxxxxxxx>
> Link: https://urldefense.com/v3/__https://lore.kernel.org/linux-nfs/7d4d57b0-39a3-49f1-8ada-60364743e3b4@xxxxxxxxxxxxx/__;!!ACWV5N9M2RV99hQ!LaRJdjZulcG71nHFWdEAszB9mJEhezxPsDxHO8xeQJ7P8a9UfYNRIm1ziuuHU5wxgEXW14vAqC1dlpSQraWaxA$ 
> Fixes: 14e41b16e8cb ("SUNRPC: Don't allow waiting for exiting tasks")
> Signed-off-by: NeilBrown <neil@xxxxxxxxxx>
> ---
>  net/sunrpc/sched.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> index 73bc39281ef5..92f39e828fbe 100644
> --- a/net/sunrpc/sched.c
> +++ b/net/sunrpc/sched.c
> @@ -276,11 +276,15 @@ EXPORT_SYMBOL_GPL(rpc_destroy_wait_queue);
>  
>  static int rpc_wait_bit_killable(struct wait_bit_key *key, int mode)
>  {
> -	if (unlikely(current->flags & PF_EXITING))
> -		return -EINTR;
> -	schedule();
> -	if (signal_pending_state(mode, current))
> -		return -ERESTARTSYS;
> +	if (unlikely(current->flags & PF_EXITING)) {
> +		/* Cannot be killed by a signal, so don't wait indefinitely */
> +		if (schedule_timeout(5 * HZ) == 0)
> +			return -EINTR;
> +	} else {
> +		schedule();
> +		if (signal_pending_state(mode, current))
> +			return -ERESTARTSYS;
> +	}
>  	return 0;
>  }
>  
Is it possible to get this merged in 6.17? I have tested this and the
LTP tests pass.

Thanks & Regards,
Harshvardhan




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux