On Fri, Mar 28, 2025 at 5:53 PM NeilBrown <neilb@xxxxxxx> wrote: > > On Sat, 29 Mar 2025, Olga Kornievskaia wrote: > > On Thu, Mar 27, 2025 at 9:43 PM NeilBrown <neilb@xxxxxxx> wrote: > > > > > > On Fri, 28 Mar 2025, Olga Kornievskaia wrote: > > > > On Thu, Mar 27, 2025 at 7:54 PM NeilBrown <neilb@xxxxxxx> wrote: > > > > > > > > > > On Sat, 22 Mar 2025, Olga Kornievskaia wrote: > > > > > > NLM locking calls need to pass thru file permission checking > > > > > > and for that prior to calling inode_permission() we need > > > > > > to set appropriate access mask. > > > > > > > > > > > > Fixes: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK") > > > > > > Signed-off-by: Olga Kornievskaia <okorniev@xxxxxxxxxx> > > > > > > --- > > > > > > fs/nfsd/vfs.c | 7 +++++++ > > > > > > 1 file changed, 7 insertions(+) > > > > > > > > > > > > diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c > > > > > > index 4021b047eb18..7928ae21509f 100644 > > > > > > --- a/fs/nfsd/vfs.c > > > > > > +++ b/fs/nfsd/vfs.c > > > > > > @@ -2582,6 +2582,13 @@ nfsd_permission(struct svc_cred *cred, struct svc_export *exp, > > > > > > if ((acc & NFSD_MAY_TRUNC) && IS_APPEND(inode)) > > > > > > return nfserr_perm; > > > > > > > > > > > > + /* > > > > > > + * For the purpose of permission checking of NLM requests, > > > > > > + * the locker must have READ access or own the file > > > > > > + */ > > > > > > + if (acc & NFSD_MAY_NLM) > > > > > > + acc = NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE; > > > > > > + > > > > > > > > > > I don't agree with this change. > > > > > The only time that NFSD_MAY_NLM is set, NFSD_MAY_OWNER_OVERRIDE is also > > > > > set. So that part of the change adds no value. > > > > > > > > > > This change only affects the case where a write lock is being requested. > > > > > In that case acc will contains NFSD_MAY_WRITE but not NFSD_MAY_READ. > > > > > This change will set NFSD_MAY_READ. Is that really needed? > > > > > > > > > > Can you please describe the particular problem you saw that is fixed by > > > > > this patch? If there is a problem and we do need to add NFSD_MAY_READ, > > > > > then I would rather it were done in nlm_fopen(). > > > > > > > > set export policy with (sec=krb5:...) then mount with sec=krb5,vers=3, > > > > then ask for an exclusive flock(), it would fail. > > > > > > > > The reason it fails is because nlm_fopen() translates lock to open > > > > with WRITE. Prior to patch 4cc9b9f2bf4d, the access would be set to > > > > acc = NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE; before calling into > > > > inode_permission(). The patch changed it and lead to lock no longer > > > > being given out with sec=krb5 policy. > > > > > > And do you have WRITE access to the file? > > > > > > check_fmode_for_setlk() in fs/locks.c suggests that for F_WRLCK to be > > > granted the file must be open for FMODE_WRITE. > > > So when an exclusive lock request arrives via NLM, nlm_lookup_file() > > > calls nlm_do_fopen() with a mode of O_WRONLY and that causes > > > nfsd_permission() to check that the caller has write access to the file. > > > > > > So if you are trying to get an exclusive lock to a file that you don't > > > have write access to, then it should fail. > > > If, however, you do have write access to the file - I cannot see why > > > asking for NFSD_MAY_READ instead of NFSD_MAY_WRITE would help. > > > > That's correct, the user doing flock() does NOT have write access to > > the file. Yet prior to the 4cc9b9f2bf4d, that access was allowed. If > > that was a bug then my bad. I assumed it was regression. > > > > It's interesting to me that on an XFS file system, I can create a file > > owned by root (on a local filesystem) and then request an exclusive > > lock on it (as a user -- no write permissions). > > "flock" is the missing piece. I always thought it was a little odd > implementing flock locks over NFS using byte-range locking. Not > necessarily wrong, but definitely odd. > > The man page for fcntl says > > In order to place a read lock, fd must be open for reading. In order > to place a write lock, fd must be open for writing. To place both > types of lock, open a file read-write. > > So byte-range locks require a consistent open mode. > > The man page for flock says > > A shared or exclusive lock can be placed on a file regardless of the > mode in which the file was opened. > > Since the NFS client started using NLM (or v4 LOCK) for flock requests, > we cannot know if a request is flock or fcntl so we cannot check the > "correct" permissions. We have to rely on the client doing the > permission checking. > > So it isn't really correct to check for either READ or WRITE. > > This is awkward because nfsd doesn't just check permissions. It has to > open the file and say what mode it is opening for. This is apparently > important when re-exporting NFS according to > > Commit: 7f024fcd5c97 ("Keep read and write fds with each nlm_file") > > So if you try an exclusive flock on a re-exported NFS file (reexported > over v3) that you have open for READ but do not have write permission > for, then the client will allow it, but the intermediate server will try > a O_WRITE open which the final server will reject. > (does re-export work over v3??) > > There is no way to make this "work". As I said: sending flock requests > over NFS was an interesting choice. > For v4 re-export it isn't a problem because the intermediate server > knows what mode the file was opened for on the client. > > So what do we do? Whatever we do needs a comment explaining that flock > vs fcntl is the problem. > Possibly we should not require read or write access - and just trust the > client. Alternately we could stick with the current practice of > requiring READ but not WRITE - it would be rare to lock a file which you > don't have read access to. > > So yes: we do need a patch here. I would suggest something like: > > /* An NLM request may be from fcntl() which requires the open mode to > * match to lock mode or may be from flock() which allows any lock mode > * with any open mode. "acc" here indicates the lock mode but we must > * do permission check reflecting the open mode which we cannot know. > * For simplicity and historical continuity, always only check for > * READ access > */ > if (acc & NFSD_MAY_NLM) > acc = (acc & ~NFSD_MAY_WRITE) | NFSD_MAY_READ; This code would also make the behaviour consistent with prior to 4cc9b9f2bf4d. But now I question whether or not the new behaviour is what is desired going forward or not? Here's another thing to consider: the same command done over nfsv4 returns an error. I guess nobody ever complained that flock over v3 was successful but failed over v4? > I'd prefer to leave the MAY_OWNER_OVERRIDE setting in nlm_fopen(). > > Thanks, > NeilBrown