On 8/21/25 2:33 PM, Olga Kornievskaia wrote: > On Thu, Aug 21, 2025 at 2:24 PM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: >> >> On 8/21/25 2:20 PM, Olga Kornievskaia wrote: >>> On Thu, Aug 21, 2025 at 11:09 AM Chuck Lever <chuck.lever@xxxxxxxxxx> wrote: >>>> >>>> On 8/21/25 9:56 AM, Olga Kornievskaia wrote: >>>>> On Wed, Aug 20, 2025 at 7:15 PM NeilBrown <neil@xxxxxxxxxx> wrote: >>>>>> >>>>>> On Thu, 14 Aug 2025, Olga Kornievskaia wrote: >>>>>>> On Tue, Aug 12, 2025 at 8:05 PM NeilBrown <neil@xxxxxxxxxx> wrote: >>>>>>>> >>>>>>>> On Wed, 13 Aug 2025, Olga Kornievskaia wrote: >>>>>>>>> When nfsd is in grace and receives an NLM LOCK request which turns >>>>>>>>> out to have a conflicting delegation, return that the server is in >>>>>>>>> grace. >>>>>>>>> >>>>>>>>> Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx> >>>>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@xxxxxxxxxx> >>>>>>>>> --- >>>>>>>>> fs/lockd/svc4proc.c | 15 +++++++++++++-- >>>>>>>>> 1 file changed, 13 insertions(+), 2 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/fs/lockd/svc4proc.c b/fs/lockd/svc4proc.c >>>>>>>>> index 109e5caae8c7..7ac4af5c9875 100644 >>>>>>>>> --- a/fs/lockd/svc4proc.c >>>>>>>>> +++ b/fs/lockd/svc4proc.c >>>>>>>>> @@ -141,8 +141,19 @@ __nlm4svc_proc_lock(struct svc_rqst *rqstp, struct nlm_res *resp) >>>>>>>>> resp->cookie = argp->cookie; >>>>>>>>> >>>>>>>>> /* Obtain client and file */ >>>>>>>>> - if ((resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file))) >>>>>>>>> - return resp->status == nlm_drop_reply ? rpc_drop_reply :rpc_success; >>>>>>>>> + resp->status = nlm4svc_retrieve_args(rqstp, argp, &host, &file); >>>>>>>>> + switch (resp->status) { >>>>>>>>> + case 0: >>>>>>>>> + break; >>>>>>>>> + case nlm_drop_reply: >>>>>>>>> + if (locks_in_grace(SVC_NET(rqstp))) { >>>>>>>>> + resp->status = nlm_lck_denied_grace_period; >>>>>>>> >>>>>>>> I think this is wrong. If the lock request has the "reclaim" flag set, >>>>>>>> then nlm_lck_denied_grace_period is not a meaningful error. >>>>>>>> nlm4svc_retrieve_args() returns nlm_drop_reply when there is a delay >>>>>>>> getting a response to an upcall to mountd. For NLM the request really >>>>>>>> must be dropped. >>>>>>> >>>>>>> Thank you for pointing out this case so we are suggesting to. >>>>>>> >>>>>>> if (locks_in_grace(SVC_NET(rqstp)) && !argp->reclaim) >>>>>>> >>>>>>> However, I've been looking and looking but I cannot figure out how >>>>>>> nlm4svc_retrieve_args() would ever get nlm_drop_reply. You say it can >>>>>>> happen during the upcall to mountd. So that happens within nfsd_open() >>>>>>> call and a part of fh_verify(). >>>>>>> To return nlm_drop_reply, nlm_fopen() must have gotten nfserr_dropit >>>>>>> from the nfsd_open(). I have searched and searched but I don't see >>>>>>> anything that ever sets nfserr_dropit (NFSERR_DROPIT). >>>>>>> >>>>>>> I searched the logs and nfserr_dropit was an error that was EAGAIN >>>>>>> translated into but then removed by the following patch. >>>>>> >>>>>> Oh. I didn't know that. >>>>>> We now use RQ_DROPME instead. >>>>>> I guess we should remove NFSERR_DROPIT completely as it not used at all >>>>>> any more. >>>>>> >>>>>> Though returning nfserr_jukebox to an v2 client isn't a good idea.... >>>>> >>>>> I'll take your word for you. >>>>> >>>>>> So I guess my main complaint isn't valid, but I still don't like this >>>>>> patch. It seems an untidy place to put the locks_in_grace test. >>>>>> Other callers of nlm4svc_retrieve_args() check locks_in_grace() before >>>>>> making that call. __nlm4svc_proc_lock probably should too. Or is there >>>>>> a reason that it is delayed until the middle of nlmsvc_lock().. >>>>> >>>>> I thought the same about why not adding the in_grace check and decided >>>>> that it was probably because you dont want to deny a lock if there are >>>>> no conflicts. If we add it, somebody might notice and will complain >>>>> that they can't get their lock when there are no conflicts. >>>>> >>>>>> The patch is not needed and isn't clearly an improvement, so I would >>>>>> rather it were dropped. >>>>> >>>>> I'm not against dropping this patch if the conclusion is that dropping >>>>> the packet is no worse than returning in_grace error. >>>> >>>> I dropped both of these from nfsd-testing. If a fix is still needed, >>>> let's start again with fresh patches. >>> >>> Can you clarify when you said "both"? >>> >>> What objections are there for the 1st patch in the series. It solves a >>> problem and a fix is needed. >> >> There are two reasons to drop the first patch. >> >> 1. Neil's "remove nfserr_dropit" patch doesn't apply unless both patches >> are reverted. >> >> 2. As I said above, NFSv2 does not have a mechanism like NFS3ERR_JUKEBOX >> to request that the client wait a bit and resend. > > ERR_JUKEBOX is not returned to another v2 or v3. > > Patch1 (nfsd: nfserr_jukebox in nlm_fopen should lead to a retry) > translates err_jukebox into the nlm_drop_reply and returns to lockd. > As the result, no error is returned to the client but the reply is > dropped all together. If you want NLM to drop the response, then set RQ_DROPME. Using nfserr_jukebox here is confusing -- it means "return a response to the client that requests a resend". You want NLM to /not send a response/, and we have a specific mechanism for that. >> So, if 1/2 has been tested with NFSv2 and does not cause NFSD to leak >> nfserr_jukebox to NFSv2 clients, then please rebase that one on the >> current nfsd-testing branch and post it again. >> >> >>> This one I agree is not needed but I >>> thought was an improvement. >>> >>>> >>>> >>>>>> Thanks, >>>>>> NeilBrown >>>>>> >>>>>> >>>>>>> >>>>>>> commit 062304a815fe10068c478a4a3f28cf091c55cb82 >>>>>>> Author: J. Bruce Fields <bfields@xxxxxxxxxxxx> >>>>>>> Date: Sun Jan 2 22:05:33 2011 -0500 >>>>>>> >>>>>>> nfsd: stop translating EAGAIN to nfserr_dropit >>>>>>> >>>>>>> diff --git a/fs/nfsd/nfsproc.c b/fs/nfsd/nfsproc.c >>>>>>> index dc9c2e3fd1b8..fd608a27a8d5 100644 >>>>>>> --- a/fs/nfsd/nfsproc.c >>>>>>> +++ b/fs/nfsd/nfsproc.c >>>>>>> @@ -735,7 +735,8 @@ nfserrno (int errno) >>>>>>> { nfserr_stale, -ESTALE }, >>>>>>> { nfserr_jukebox, -ETIMEDOUT }, >>>>>>> { nfserr_jukebox, -ERESTARTSYS }, >>>>>>> - { nfserr_dropit, -EAGAIN }, >>>>>>> + { nfserr_jukebox, -EAGAIN }, >>>>>>> + { nfserr_jukebox, -EWOULDBLOCK }, >>>>>>> { nfserr_jukebox, -ENOMEM }, >>>>>>> { nfserr_badname, -ESRCH }, >>>>>>> { nfserr_io, -ETXTBSY }, >>>>>>> >>>>>>> so if fh_verify is failing whatever it is returning, it is not >>>>>>> nfserr_dropit nor is it nfserr_jukebox which means nlm_fopen() would >>>>>>> translate it to nlm_failed which with my patch would not trigger >>>>>>> nlm_lck_denied_grace_period error but resp->status would be set to >>>>>>> nlm_failed. >>>>>>> >>>>>>> So I circle back to I hope that convinces you that we don't need a >>>>>>> check for the reclaim lock. >>>>>>> >>>>>>> I believe nlm_drop_reply is nfsd_open's jukebox error, one of which is >>>>>>> delegation recall. it can be a memory failure. But I'm sure when >>>>>>> EWOULDBLOCK occurs. >>>>>>> >>>>>>>> At the very least we need to guard against the reclaim flag being set in >>>>>>>> the above test. But I would much rather a more clear distinction were >>>>>>>> made between "drop because of a conflicting delegation" and "drop >>>>>>>> because of a delay getting upcall response". >>>>>>>> Maybe a new "nlm_conflicting_delegtion" return from ->fopen which nlm4 >>>>>>>> (and ideally nlm2) handles appropriately. >>>>>>> >>>>>>> >>>>>>>> NeilBrown >>>>>>>> >>>>>>>> >>>>>>>>> + return rpc_success; >>>>>>>>> + } >>>>>>>>> + return nlm_drop_reply; >>>>>>>>> + default: >>>>>>>>> + return rpc_success; >>>>>>>>> + } >>>>>>>>> >>>>>>>>> /* Now try to lock the file */ >>>>>>>>> resp->status = nlmsvc_lock(rqstp, file, host, &argp->lock, >>>>>>>>> -- >>>>>>>>> 2.47.1 >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >>>> >>>> -- >>>> Chuck Lever >>> >> >> >> -- >> Chuck Lever -- Chuck Lever