On Thu, Aug 21, 2025 at 9:56 AM Olga Kornievskaia <aglo@xxxxxxxxx> wrote: > > On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@xxxxxxxxxx> wrote: > > > > On Thu, 14 Aug 2025, Olga Kornievskaia wrote: > > > On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@xxxxxxxxxx> wrote: > > > > > > > > On Wed, 13 Aug 2025, Olga Kornievskaia wrote: > > > > > When nfsd is in grace and receives an NLM LOCK request which turns > > > > > out to have a conflicting delegation, return that the server is in > > > > > grace. My last attempt for restoring (prior to write delegations) behaviour vs what happens now (even with patch#1). I bring it up again because there has been misunderstanding of what the patch series is doing and I believe (some) technical concerns (regarding nfsv2 client) have been discussed and resolved. In the old code (prior to write delegations), when a v3 client sent a lock request (during grace period) and when a v4 client reclaimed the same lock after the server rebooted, the v3 client got the in_grace error. With patch #1 (not in question here), the code at least isn't failing the request with nlm_failed and instead drops the request. No lock failure but the overall behaviour of the nfs client is different (details below). > > > > > Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx> > > > > > Signed-off-by: Olga Kornievskaia <okorniev@xxxxxxxxxx> > > > > > --- > > > > > fs/lockd/svc4proc.c | 15 +++++++++++++-- > > > > > 1 file changed, 13 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c > > > > > index 109e5caae8c7..7ac4af5c9875 100644 > > > > > --- a/fs/lockd/svc4proc.c > > > > > +++ b/fs/lockd/svc4proc.c > > > > > @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp) > > > > > resp->cookie = argp->cookie; > > > > > > > > > > /* Obtain client and file */ > > > > > - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file))) > > > > > - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success; > > > > > + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file); > > > > > + switch (resp->status) { > > > > > + case 0: > > > > > + break; > > > > > + case nlm_drop_reply: > > > > > + if (locks_in_grace(SVC_NET(rqstp))) { > > > > > + resp->status = nlm_lck_denied_grace_period; > > > > > > > > I think this is wrong. If the lock request has the "reclaim" flag set, > > > > then nlm_lck_denied_grace_period is not a meaningful error. > > > > nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay > > > > getting a response to an upcall to mountd. For NLM the request really > > > > must be dropped. > > > > > > Thank you for pointing out this case so we are suggesting to. > > > > > > if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim) > > > > > > However, I've been looking and looking but I cannot figure out how > > > nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can > > > happen during the upcall to mountd. So that happens within nfsd_open() > > > call and a part of fh_verify(). > > > To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit > > > from the nfsd_open(). I have searched and searched but I don't see > > > anything that ever sets nfserr_dropit (NFSERR_DROPIT). > > > > > > I searched the logs and nfserr_dropit was an error that was EAGAIN > > > translated into but then removed by the following patch. > > > > Oh. I didn't know that. > > We now use RQ_DROPME instead. > > I guess we should remove NFSERR_DROPIT completely as it not used at all > > any more. > > > > Though returning nfserr_jukebox to an v2 client isn't a good idea.... > > I'll take your word for you. I didn't write the correct words here. I have said the correct words in the other patch. v2 (or v3) client does not get an nfserr_jukebox. > > So I guess my main complaint isn't valid, but I still don't like this > > patch. It seems an untidy place to put the locks_in_grace test. > > Other callers of nlm4svc_retrieve_args() check locks_in_grace() before > > making that call. __nlm4svc_proc_lock probably should too. Or is there > > a reason that it is delayed until the middle of nlmsvc_lock().. > > I thought the same about why not adding the in_grace check and decided > that it was probably because you dont want to deny a lock if there are > no conflicts. If we add it, somebody might notice and will complain > that they can't get their lock when there are no conflicts. The disadvantage of unconditionally adding in_grace check() in the lock call is that again it would produce a difference in behaviour where some lock would not be granted as fast as before (ie., some environments might notice the change?). > > The patch is not needed and isn't clearly an improvement, so I would > > rather it were dropped. > > I'm not against dropping this patch if the conclusion is that dropping > the packet is no worse than returning in_grace error. My argument for inclusion of this patch is that it restores previous behaviour that the nfs client experienced before (ie., receiving in_grace error). When we rely on drop-retry behavior, can there be problems that the client runs out of re-tries and fails the request (nlm has a hard coded retry limit of 5 and then its own timeout value)? I admit my reasons are hypothetical, we can wait until somebody complains. > > Thanks, > > NeilBrown > > > > > > > > > > commit 062304a815fe10068c478a4a3f28cf091c55cb82 > > > Author: J. Bruce Fields <bfields@xxxxxxxxxxxx> > > > Date: Sun Jan 2 22:05:33 2011 -0500 > > > > > > nfsd: stop translating EAGAIN to nfserr_dropit > > > > > > diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c > > > index dc9c2e3fd1b8..fd608a27a8d5 100644 > > > --- a/fs/nfsd/nfsproc.c > > > +++ b/fs/nfsd/nfsproc.c > > > @@ -735,7 +735,8 @@ nfserrno (int errno) > > > { nfserr_stale, -ESTALE }, > > > { nfserr_jukebox, -ETIMEDOUT }, > > > { nfserr_jukebox, -ERESTARTSYS }, > > > - { nfserr_dropit, -EAGAIN }, > > > + { nfserr_jukebox, -EAGAIN }, > > > + { nfserr_jukebox, -EWOULDBLOCK }, > > > { nfserr_jukebox, -ENOMEM }, > > > { nfserr_badname, -ESRCH }, > > > { nfserr_io, -ETXTBSY }, > > > > > > so if fh_verify is failing whatever it is returning, it is not > > > nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would > > > translate it to nlm_failed which with my patch would not trigger > > > nlm_lck_denied_grace_period error but resp->status would be set to > > > nlm_failed. > > > > > > So I circle back to I hope that convinces you that we don't need a > > > check for the reclaim lock. > > > > > > I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is > > > delegation recall. it can be a memory failure. But I'm sure when > > > EWOULDBLOCK occurs. > > > > > > > At the very least we need to guard against the reclaim flag being set in > > > > the above test. But I would much rather a more clear distinction were > > > > made between "drop because of a conflicting delegation" and "drop > > > > because of a delay getting upcall response". > > > > Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4 > > > > (and ideally nlm2) handles appropriately. > > > > > > > > > > NeilBrown > > > > > > > > > > > > > + return rpc_success; > > > > > + } > > > > > + return nlm_drop_reply; > > > > > + default: > > > > > + return rpc_success; > > > > > + } > > > > > > > > > > /* Now try to lock the file */ > > > > > resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock, > > > > > -- > > > > > 2.47.1 > > > > > > > > > > > > > > > > > > > > > > >