Re: [PATCH 2/6] VFS/ovl: add lookup_one_positive_killable()

"NeilBrown" <neilb@xxxxxxxxxxx> · Tue, 09 Sep 2025 12:14:07 +1000

On Mon, 08 Sep 2025, Amir Goldstein wrote:
> On Mon, Sep 8, 2025 at 4:07 AM NeilBrown <neilb@xxxxxxxxxxx> wrote:
> >
> > On Sat, 06 Sep 2025, Amir Goldstein wrote:
> > > On Sat, Sep 6, 2025 at 7:00 AM NeilBrown <neilb@xxxxxxxxxxx> wrote:
> > > >
> > > > From: NeilBrown <neil@xxxxxxxxxx>
> > > >
> > > > ovl wants a lookup which won't block on a fatal signal.
> > > > It currently uses down_write_killable() and then repeated
> > > > calls to lookup_one()
> > > >
> > > > The lock may not be needed if the name is already in the dcache and it
> > > > aid proposed future changes if the locking is kept internal to namei.c
> > > >
> > > > So this patch adds lookup_one_positive_killable() which is like
> > > > lookup_one_positive() but will abort in the face of a fatal signal.
> > > > overlayfs is changed to use this.
> > > >
> > > > Signed-off-by: NeilBrown <neil@xxxxxxxxxx>
> > >
> > > I think the commit should mention that this changes from
> > > inode_lock_killable() to inode_lock_shared_killable() on the
> > > underlying dir inode which is a good thing for this scope.
> > >
> > > BTW I was reading the git history that led to down_write_killable()
> > > in this code and I had noticed that commit 3e32715496707
> > > ("vfs: get rid of old '->iterate' directory operation") has made
> > > the ovl directory iteration non-killable when promoting the read
> > > lock on the ovl directory to write lock.
> >
> > hmmmm....
> >
> > So the reason that this uses a killable lock is simply because it used
> > to happen under readdir and readdir uses a killable lock.  Is that
> > right?
> 
> I think the semantics were copied from readdir of that moment yes.
> 
> >
> > So there is no particularly reason that "killable" is important here?
> 
> I can think of some reasons -
> Maybe overlayfs (ever user mounted overlayfs) has just one process
> accessing it but underlying lower layer is remote fs with many processes
> accessing it so chances of lower layer dir lock being held by another thread
> are much higher than chances of overlayfs dir lock being held.
> 
> > So I could simply change it to use lookup_one_positive() and you
> > wouldn't mind?
> >
> 
> I do mind and prefer that you keep this killable as you patch does.
> The more important reason to keep this killable IMO is that we can and
> should make overlayfs readdir shared lock one day.

Fair enough - I will persist with my current patch.  I just wanted to be
sure it was wanted.

> 
> > I'd actually like to make all directory/dentry locking killable - I
> > don't think there is any downside.  But I don't want to try pushing that
> > until my current exercise is finished.
> >
> 
> The path to making overlayfs readdir shared and killable is
> to move the synchronization of ovl readdir cache and
> OVL_I(inode)->version from the implicit vfs inode_lock() to
> explicit ovl_inode_lock().
> 
> The mechanical change is easy.
> My concern is from hidden assumptions in the code that
> I am not aware of, ones which are not annotated with
> inode_is_locked() like ovl_inode_version_get() and
> ovl_dir_version_inc() are.
> 
> And the fact that noone has yet to complain about overlayfs readdir
> scalability makes this conversion non urgent.
> 
> If you have other reasons to want to make ovl readdir killable
> or shared, we can look into that.

readdir locking is not on my radar.  We need to keep it to ensure
exclusion with rmdir (as we need a counter of current readdir threads
and there is nowhere else suitable to store a counter even if I wanted
to).

And readdir is already non-exclusive and would not benefit from
asynchrony (as far as I can see), so there is no need to change it.

I might find improvements to ovl readdir locking interesting, but not at
all a priority.

Thanks,
NeilBrown