Re: [PATCH] sunrpc: don't fail immediately in rpc_wait_bit_killable()

"NeilBrown" <neil@xxxxxxxxxx> · Fri, 29 Aug 2025 11:04:48 +1000

On Thu, 28 Aug 2025, Trond Myklebust wrote:
> On Thu, 2025-08-28 at 18:12 +0530, Harshvardhan Jha wrote:
> > Hi there,
> > 
> > On 20/08/25 3:08 AM, NeilBrown wrote:
> > > rpc_wait_bit_killable() is called when it is appropriate for a
> > > fatal
> > > signal to abort the wait.
> > > 
> > > If it is called late during process exit after exit_signals() is
> > > called
> > > (and when PF_EXITING is set), it cannot receive a fatal signal so
> > > waiting indefinitely is not safe.
> > > 
> > > However aborting immediately, as it currently does, is not ideal as
> > > it
> > > mean that the related NFS request cannot succeed, even if the
> > > network
> > > and server are working properly.
> > > 
> > > One of the causes of filesystem IO when PF_EXITING is set is
> > > acct_process() which may access the process accounting file.  For a
> > > NFS-root configuration, this can be accessed over NFS.
> > > 
> > > In this configuration LTP test "acct02" fails.
> > > 
> > > Though waiting indefinitely is not appropriate, aborting
> > > immediately is
> > > also not desirable.  This patch aims for a middle ground of waiting
> > > at
> > > most 5 seconds.  This should be enough when NFS service is working,
> > > but
> > > not so much as to delay process exit excessively when NFS service
> > > is not
> > > functioning.
> > > 
> > > Reported-by: Mark Brown <broonie@xxxxxxxxxx>
> > > Reported-and-tested-by: Harshvardhan Jha
> > > <harshvardhan.j.jha@xxxxxxxxxx>
> > > Link:
> > > https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.com%2Fv3%2F__https%3A%2F%2Flore.kernel.org%2Flinux-nfs%2F7d4d57b0-39a3-49f1-8ada-60364743e3b4%40sirena.org.uk%2F__%3B!!ACWV5N9M2RV99hQ!LaRJdjZulcG71nHFWdEAszB9mJEhezxPsDxHO8xeQJ7P8a9UfYNRIm1ziuuHU5wxgEXW14vAqC1dlpSQraWaxA%24&data=05%7C02%7Ctrondmy%40hammerspace.com%7C5463825a86c248e5766c08dde6306312%7C0d4fed5c3a7046fe9430ece41741f59e%7C0%7C0%7C638919817692991630%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=GBRM89CFMk2vKyeN4yKBvjiV9IrODC4tbwMfRSk4Cfc%3D&reserved=0
> > >  
> > > Fixes: 14e41b16e8cb ("SUNRPC: Don't allow waiting for exiting
> > > tasks")
> > > Signed-off-by: NeilBrown <neil@xxxxxxxxxx>
> > > ---
> > >  net/sunrpc/sched.c | 14 +++++++++-----
> > >  1 file changed, 9 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c
> > > index 73bc39281ef5..92f39e828fbe 100644
> > > --- a/net/sunrpc/sched.c
> > > +++ b/net/sunrpc/sched.c
> > > @@ -276,11 +276,15 @@ EXPORT_SYMBOL_GPL(rpc_destroy_wait_queue);
> > >  
> > >  static int rpc_wait_bit_killable(struct wait_bit_key *key, int
> > > mode)
> > >  {
> > > -	if (unlikely(current->flags & PF_EXITING))
> > > -		return -EINTR;
> > > -	schedule();
> > > -	if (signal_pending_state(mode, current))
> > > -		return -ERESTARTSYS;
> > > +	if (unlikely(current->flags & PF_EXITING)) {
> > > +		/* Cannot be killed by a signal, so don't wait
> > > indefinitely */
> > > +		if (schedule_timeout(5 * HZ) == 0)
> > > +			return -EINTR;
> > > +	} else {
> > > +		schedule();
> > > +		if (signal_pending_state(mode, current))
> > > +			return -ERESTARTSYS;
> > > +	}
> > >  	return 0;
> > >  }
> > >  
> > Is it possible to get this merged in 6.17? I have tested this and the
> > LTP tests pass
> 
> If we are in this situation, it is because some signal has already
> killed the parent task. That throws any data integrity guarantees right
> out of the window.

Or it could be because the task has exited normally.

> 
> So what problem is this patch trying to fix?

Process accounting writes a record to the accounting file when a process
exits.  It writes this record from the context of the process that is
exiting.  It does this after signals have been disabled.

So this is not related to data integrity for any data that the process
wrote.  I believe that is all handled correctly, partly because writes
and close are performed asynchronously and partly because nfs_wb_all()
ultimately uses a non-killable wait.

It is only related to writing to the process accounting file.

Thanks,
NeilBrown

> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trondmy@xxxxxxxxxx, trond.myklebust@xxxxxxxxxxxxxxx
>