On 3/28/25 7:29 PM, Tom Talpey wrote: > On 3/28/2025 5:53 PM, NeilBrown wrote: >> On Sat, 29 Mar 2025, Olga Kornievskaia wrote: >>> On Thu, Mar 27, 2025 at 9:43 PM NeilBrown <neilb@xxxxxxx> wrote: >>>> >>>> On Fri, 28 Mar 2025, Olga Kornievskaia wrote: >>>>> On Thu, Mar 27, 2025 at 7:54 PM NeilBrown <neilb@xxxxxxx> wrote: >>>>>> >>>>>> On Sat, 22 Mar 2025, Olga Kornievskaia wrote: >>>>>>> NLM locking calls need to pass thru file permission checking >>>>>>> and for that prior to calling inode_permission() we need >>>>>>> to set appropriate access mask. >>>>>>> >>>>>>> Fixes: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK") >>>>>>> Signed-off-by: Olga Kornievskaia <okorniev@xxxxxxxxxx> >>>>>>> --- >>>>>>> fs/nfsd/vfs.c | 7 +++++++ >>>>>>> 1 file changed, 7 insertions(+) >>>>>>> >>>>>>> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c >>>>>>> index 4021b047eb18..7928ae21509f 100644 >>>>>>> --- a/fs/nfsd/vfs.c >>>>>>> +++ b/fs/nfsd/vfs.c >>>>>>> @@ -2582,6 +2582,13 @@ nfsd_permission(struct svc_cred *cred, >>>>>>> struct svc_export *exp, >>>>>>> if ((acc & NFSD_MAY_TRUNC) && IS_APPEND(inode)) >>>>>>> return nfserr_perm; >>>>>>> >>>>>>> + /* >>>>>>> + * For the purpose of permission checking of NLM requests, >>>>>>> + * the locker must have READ access or own the file >>>>>>> + */ >>>>>>> + if (acc & NFSD_MAY_NLM) >>>>>>> + acc = NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE; >>>>>>> + >>>>>> >>>>>> I don't agree with this change. >>>>>> The only time that NFSD_MAY_NLM is set, NFSD_MAY_OWNER_OVERRIDE is >>>>>> also >>>>>> set. So that part of the change adds no value. >>>>>> >>>>>> This change only affects the case where a write lock is being >>>>>> requested. >>>>>> In that case acc will contains NFSD_MAY_WRITE but not NFSD_MAY_READ. >>>>>> This change will set NFSD_MAY_READ. Is that really needed? >>>>>> >>>>>> Can you please describe the particular problem you saw that is >>>>>> fixed by >>>>>> this patch? If there is a problem and we do need to add >>>>>> NFSD_MAY_READ, >>>>>> then I would rather it were done in nlm_fopen(). >>>>> >>>>> set export policy with (sec=krb5:...) then mount with sec=krb5,vers=3, >>>>> then ask for an exclusive flock(), it would fail. >>>>> >>>>> The reason it fails is because nlm_fopen() translates lock to open >>>>> with WRITE. Prior to patch 4cc9b9f2bf4d, the access would be set to >>>>> acc = NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE; before calling into >>>>> inode_permission(). The patch changed it and lead to lock no longer >>>>> being given out with sec=krb5 policy. >>>> >>>> And do you have WRITE access to the file? >>>> >>>> check_fmode_for_setlk() in fs/locks.c suggests that for F_WRLCK to be >>>> granted the file must be open for FMODE_WRITE. >>>> So when an exclusive lock request arrives via NLM, nlm_lookup_file() >>>> calls nlm_do_fopen() with a mode of O_WRONLY and that causes >>>> nfsd_permission() to check that the caller has write access to the >>>> file. >>>> >>>> So if you are trying to get an exclusive lock to a file that you don't >>>> have write access to, then it should fail. >>>> If, however, you do have write access to the file - I cannot see why >>>> asking for NFSD_MAY_READ instead of NFSD_MAY_WRITE would help. >>> >>> That's correct, the user doing flock() does NOT have write access to >>> the file. Yet prior to the 4cc9b9f2bf4d, that access was allowed. If >>> that was a bug then my bad. I assumed it was regression. >>> >>> It's interesting to me that on an XFS file system, I can create a file >>> owned by root (on a local filesystem) and then request an exclusive >>> lock on it (as a user -- no write permissions). >> >> "flock" is the missing piece. I always thought it was a little odd >> implementing flock locks over NFS using byte-range locking. Not >> necessarily wrong, but definitely odd. >> >> The man page for fcntl says >> >> In order to place a read lock, fd must be open for reading. In order >> to place a write lock, fd must be open for writing. To place both >> types of lock, open a file read-write. >> >> So byte-range locks require a consistent open mode. >> >> The man page for flock says >> >> A shared or exclusive lock can be placed on a file regardless of the >> mode in which the file was opened. >> >> Since the NFS client started using NLM (or v4 LOCK) for flock requests, >> we cannot know if a request is flock or fcntl so we cannot check the >> "correct" permissions. We have to rely on the client doing the >> permission checking. >> >> So it isn't really correct to check for either READ or WRITE. > > Just one thing to mention, newer versions of the flock(2) manpage do > mention the NFS/NLM behavior w.r.t. open for writing: > > Since Linux 2.6.12, NFS clients support flock() locks by emulating > them as fcntl(2) byte-range locks on the entire file. This means > that fcntl(2) and flock() locks do interact with one another over > NFS. It also means that in order to place an exclusive lock, the > file must be opened for writing. > > Not sure this solves the question, but it's "documented". The text > should maybe be revisited either way. Thanks, Neil and Tom, for digging this out. I agree that the new code comment should explicitly mention that this logic is necessary due to our NFSv3 implementation emulating flock() with fcntl() byte-range locks. > Tom. > >> This is awkward because nfsd doesn't just check permissions. It has to >> open the file and say what mode it is opening for. This is apparently >> important when re-exporting NFS according to >> >> Commit: 7f024fcd5c97 ("Keep read and write fds with each nlm_file") >> >> So if you try an exclusive flock on a re-exported NFS file (reexported >> over v3) that you have open for READ but do not have write permission >> for, then the client will allow it, but the intermediate server will try >> a O_WRITE open which the final server will reject. >> (does re-export work over v3??) >> >> There is no way to make this "work". As I said: sending flock requests >> over NFS was an interesting choice. >> For v4 re-export it isn't a problem because the intermediate server >> knows what mode the file was opened for on the client. >> >> So what do we do? Whatever we do needs a comment explaining that flock >> vs fcntl is the problem. >> Possibly we should not require read or write access - and just trust the >> client. Alternately we could stick with the current practice of >> requiring READ but not WRITE - it would be rare to lock a file which you >> don't have read access to. >> >> So yes: we do need a patch here. I would suggest something like: >> >> /* An NLM request may be from fcntl() which requires the open mode to >> * match to lock mode or may be from flock() which allows any lock mode >> * with any open mode. "acc" here indicates the lock mode but we must >> * do permission check reflecting the open mode which we cannot know. >> * For simplicity and historical continuity, always only check for >> * READ access >> */ >> if (acc & NFSD_MAY_NLM) >> acc = (acc & ~NFSD_MAY_WRITE) | NFSD_MAY_READ; >> >> I'd prefer to leave the MAY_OWNER_OVERRIDE setting in nlm_fopen(). >> >> Thanks, >> NeilBrown >> > -- Chuck Lever