On Thu 12-06-25 15:55:40, Mateusz Guzik wrote: > On Thu, Jun 12, 2025 at 10:41:01AM +0100, Luis Henriques wrote: > > The assert in function file_seek_cur_needs_f_lock() can be triggered very > > easily because, as Jan Kara suggested, the file reference may get > > incremented after checking it with fdget_pos(). > > > > Fixes: da06e3c51794 ("fs: don't needlessly acquire f_lock") > > Signed-off-by: Luis Henriques <luis@xxxxxxxxxx> > > --- > > Hi Christian, > > > > It wasn't clear whether you'd be queueing this fix yourself. Since I don't > > see it on vfs.git, I decided to explicitly send the patch so that it doesn't > > slip through the cracks. > > > > Cheers, > > -- > > Luis > > > > fs/file.c | 2 -- > > 1 file changed, 2 deletions(-) > > > > diff --git a/fs/file.c b/fs/file.c > > index 3a3146664cf3..075f07bdc977 100644 > > --- a/fs/file.c > > +++ b/fs/file.c > > @@ -1198,8 +1198,6 @@ bool file_seek_cur_needs_f_lock(struct file *file) > > if (!(file->f_mode & FMODE_ATOMIC_POS) && !file->f_op->iterate_shared) > > return false; > > > > - VFS_WARN_ON_ONCE((file_count(file) > 1) && > > - !mutex_is_locked(&file->f_pos_lock)); > > return true; > > } > > fdget_pos() can only legally skip locking if it determines to be in > position where nobody else can operate on the same file obj, meaning > file_count(file) == 1 and it can't go up. Otherwise the lock is taken. > > Or to put it differently, fdget_pos() NOT taking the lock and new refs > showing up later is a bug. I mostly agree and as I've checked again, this indeed seems to be the case as fdget() will increment f_ref if file table is shared with another thread and thus file_needs_f_pos_lock() returns true whenever there are more threads sharing the file table or if the struct file is dupped to another fd. That being said I find the assertion in file_seek_cur_needs_f_lock() misplaced - it just doesn't make sense in that place to me. > I don't believe anything of the sort is happening here. > > Instead, overlayfs is playing games and *NOT* going through fdget_pos(): > > ovl_inode_lock(inode); > realfile = ovl_real_file(file); > [..] > ret = vfs_llseek(realfile, offset, whence); > > Given the custom inode locking around the call, it may be any other > locking is unnecessary and the code happens to be correct despite the > splat. Right and good spotting. That's indeed more likely explanation than mine. Actually custom locking around llseek isn't all that uncommon (mostly for historical reasons AFAIK but that's another story). > I think the safest way out with some future-proofing is to in fact *add* > the locking in ovl_llseek() to shut up the assert -- personally I find > it uneasy there is some underlying file obj flying around. Well, if you grep for vfs_llseek(), you'll see there are much more calls to it in the kernel than overlayfs. These callers outside of fs/read_write.c are responsible for their locking. So I don't think keeping the assert in file_seek_cur_needs_f_lock() makes any sense. If anything I'd be open to putting it in fdput_pos() or something like that. > Even if ultimately the assert has to go, the proposed commit message > does not justify it. I guess the commit message could be improved. Something like: The assert in function file_seek_cur_needs_f_lock() can be triggered very easily because there are many users of vfs_llseek() (such as overlayfs) that do their custom locking around llseek instead of relying on fdget_pos(). Just drop the overzealous assertion. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR